DOI: 10.1145/2628071.2628094
research-article

DeSTM: harnessing determinism in STMs for application development

Published: 24 August 2014

ABSTRACT

Non-determinism has long been recognized as one of the key challenges restricting parallel programmer productivity, because it complicates several phases of application development. Software Transactional Memory (STM) systems have greatly improved the productivity of programmers developing parallel applications in a host of areas, but they still exhibit non-deterministic behavior, which reduces productivity. While determinism in parallel applications that use traditional synchronization primitives (such as locks) has been relatively well studied, its interplay with STMs has not. In this paper we present DeSTM, a deterministic STM that allows programmers to leverage determinism throughout the implementation, debugging, and testing phases of application development.

In this work we first adapt techniques that introduce determinism in applications using traditional synchronization (such as locks) to work in conjunction with certain STMs. As one would expect, this degrades performance relative to a non-deterministic execution. Next we present DeSTM, which uses novel techniques that exploit the properties of these STMs to dramatically improve the performance of deterministic executions. Further, DeSTM allows programmers to randomly change the deterministic schedule in a controlled fashion, giving them access to a wide variety of execution schedules during application development.
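The abstract does not describe DeSTM's mechanism, so the following is only a generic illustration of the two ideas named above: a deterministic schedule, and a controlled way to vary it. It is a minimal sketch that assumes a simple token-passing scheme in which threads commit in a fixed round-robin order derived from a seed. The names `DeterministicCommitOrder`, `commit`, and `run` are hypothetical and do not come from the paper.

```python
import random
import threading


class DeterministicCommitOrder:
    """Toy token scheme (an assumption, not DeSTM's algorithm):
    threads commit in a fixed round-robin order derived from a seed."""

    def __init__(self, num_threads, seed=0):
        order = list(range(num_threads))
        # The same seed always yields the same permutation of thread ids.
        random.Random(seed).shuffle(order)
        self.order = order
        self.turn = 0
        self.cond = threading.Condition()

    def commit(self, thread_id, action):
        """Block until it is thread_id's turn, run action, pass the token."""
        with self.cond:
            while self.order[self.turn % len(self.order)] != thread_id:
                self.cond.wait()
            action()
            self.turn += 1
            self.cond.notify_all()


def run(num_threads=4, rounds=3, seed=0):
    """Each thread commits once per round; returns the global commit log."""
    sched = DeterministicCommitOrder(num_threads, seed)
    log = []

    def worker(tid):
        for r in range(rounds):
            # The action runs while the token is held, so appends are ordered.
            sched.commit(tid, lambda r=r: log.append((r, tid)))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    # Two runs with the same seed produce the same commit log.
    print(run(seed=1) == run(seed=1))
```

In this sketch, changing `seed` selects a different (but still repeatable) commit order, which loosely mirrors the idea of exploring several schedules during testing. The sketch assumes every thread performs the same number of commits; a thread that stopped early would stall the others.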

We demonstrate our approach on the STAMP benchmark suite. We first study the overheads that determinism introduces in STM applications and then demonstrate that DeSTM significantly improves the performance of deterministic execution, by over 50% in some applications and by about 35% on average. DeSTM also helped us detect what we believe is a bug in one of the benchmarks. Further, our approach is programmer friendly, requiring no changes to application code.


Published in

PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
August 2014
514 pages
ISBN: 9781450328098
DOI: 10.1145/2628071

Copyright © 2014 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PACT '14 Paper Acceptance Rate: 54 of 144 submissions, 38%
Overall Acceptance Rate: 121 of 471 submissions, 26%
