ABSTRACT
Deterministic replay tools help programmers debug concurrent programs. However, for long-running programs, a replay tool may generate a huge log of shared-memory access dependences. In this paper, we present CARE, an application-level deterministic record and replay technique that reduces the log size. The key idea of CARE is to log read-write dependences only at per-thread value prediction cache misses. This strategy records only a subset of the exact read-write dependences and reduces the synchronization that protects memory reads in the instrumented code. Because such a recording strategy provides only value-deterministic replay, CARE also adopts variable grouping and action prioritization heuristics to synthesize sequentially consistent executions at replay in linear time. We implemented CARE in Java and evaluated it experimentally on recognized benchmarks. The results show that CARE successfully resolved all missing read-write dependences, producing sequentially consistent replay for all benchmarks. Compared with the state-of-the-art technique LEAP, CARE exhibited 1.7–40X (median 3.4X) smaller runtime overhead and 1.1–309X (median 7.0X) smaller log size.
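To make the key idea concrete, the following is a minimal Java sketch of logging a read only when a per-thread cache fails to predict its value. It is an illustration under stated assumptions, not CARE's implementation: the class `CacheGuidedRecorder`, its methods, the explicit thread ids, the write naming scheme (`T1#0`), and the log format are all hypothetical, and a single lock is used for simplicity, so the sketch does not model the reduced synchronization on reads described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names and the log format are assumptions.
class CacheGuidedRecorder {
    // Shared memory: variable name -> current value (unwritten variables read as 0).
    private final Map<String, Integer> memory = new HashMap<>();
    // Identifier of the last write to each variable, e.g. "T1#0".
    private final Map<String, String> lastWrite = new HashMap<>();
    // Per-thread value prediction caches: thread id -> (variable -> predicted value).
    private final Map<Integer, Map<String, Integer>> caches = new HashMap<>();
    // Number of writes issued so far by each thread, used to name writes.
    private final Map<Integer, Integer> writeCounts = new HashMap<>();
    // Recorded read-write dependences, one entry per cache miss.
    private final List<String> log = new ArrayList<>();

    private Map<String, Integer> cacheOf(int tid) {
        return caches.computeIfAbsent(tid, k -> new HashMap<>());
    }

    // A write updates memory and the writer's own cache; nothing is logged.
    synchronized void write(int tid, String var, int value) {
        int n = writeCounts.merge(tid, 1, Integer::sum) - 1;
        memory.put(var, value);
        lastWrite.put(var, "T" + tid + "#" + n);
        cacheOf(tid).put(var, value);
    }

    // A read is logged only if the thread's cache fails to predict the value.
    synchronized int read(int tid, String var) {
        int value = memory.getOrDefault(var, 0);
        Integer predicted = cacheOf(tid).get(var);
        if (predicted == null || predicted != value) {
            // Cache miss: record which write this read depends on.
            log.add("T" + tid + " reads " + var + " from "
                    + lastWrite.getOrDefault(var, "init"));
            cacheOf(tid).put(var, value);
        }
        return value;
    }

    // Returns a copy of the recorded dependences.
    synchronized List<String> log() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        CacheGuidedRecorder r = new CacheGuidedRecorder();
        r.write(1, "x", 1);
        r.read(2, "x");   // miss: thread 2 has no prediction for x
        r.read(2, "x");   // hit: cached value still matches
        r.read(2, "x");   // hit
        r.write(1, "x", 2);
        r.read(2, "x");   // miss: value changed since thread 2 last read it
        r.read(1, "x");   // hit: thread 1 cached its own write
        System.out.println(r.log());
    }
}
```

In this run, only two of the five reads are logged. Note that a cache hit means the value matched the prediction, not that the same write produced it, which is consistent with the observation that such a log alone yields only value-deterministic replay.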
Index Terms
- CARE: cache guided deterministic replay for concurrent Java programs