DOI: 10.1145/2393596.2393650
research-article

DTAM: dynamic taint analysis of multi-threaded programs for relevancy

Published: 11 November 2012

Abstract

Testing and debugging multi-threaded programs is notoriously difficult due to non-determinism not only in inputs but also in OS schedules. In practice, dynamic analysis and failure replay systems instrument the program to record events of interest in the test execution, e.g., program inputs, accesses to shared objects, synchronization operations, and context switches. To reduce the runtime overhead of logging, such systems resort to sampling or selective logging, at the cost of reduced coverage or a more expensive offline search.
We propose to identify a subset of input sources and shared objects that are, in a sense, relevant for covering program behavior. We classify various types of relevancy in terms of how an input source or a shared object can affect control flow (e.g., a conditional branch) or dataflow (e.g., state of the shared objects) in the program. Such relevancy data can be used by testing and debugging methods to reduce their recording overhead and to guide coverage.
To conduct relevancy analysis, we propose a novel framework based on dynamic taint analysis for multi-threaded programs, called DTAM. It performs thread-modular taint analysis for each thread in parallel during runtime, and then aggregates the thread-modular results offline. This approach has many advantages: (a) it is faster than conducting taint analysis on serialized multi-threaded executions, (b) it can compute results for alternate thread interleavings by generalizing the observed execution, and (c) it provides a knob to trade off precision against coverage, depending on how thread-modular results are aggregated to account for alternate interleavings. We have implemented DTAM and performed an experimental evaluation on publicly available benchmarks for relevancy analysis. Our experiments show that most shared accesses and conditional branches depend on some program input sources. Interestingly, in our test runs, on average only about 25% of input sources and 3% of shared objects affect other shared accesses through conditional branches. Thus, it is important to identify such relevant input sources and shared objects for testing and debugging.
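
The two-phase idea in the abstract (a per-thread taint pass, then offline aggregation) can be illustrated with a toy sketch. This is not the DTAM implementation: the event format, the names `analyze_thread` and `aggregate`, and the aggregation policy are assumptions made for illustration. Each thread's trace is analyzed in isolation, with loads from shared objects recorded as placeholders. The offline step then resolves those placeholders by assuming that any store to a shared object, from any thread, may reach any load of it. That is one coarse way to generalize over alternate interleavings, favoring coverage over precision.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def analyze_thread(trace):
    """Thread-local pass: propagate taint labels without consulting other threads."""
    taint = defaultdict(set)    # local variable -> set of labels
    stores = defaultdict(set)   # shared object -> labels this thread wrote into it
    branches = []               # labels feeding each conditional branch
    for op, *args in trace:
        if op == "input":       # ("input", dst, source_name)
            dst, src = args
            taint[dst] = {("in", src)}
        elif op == "assign":    # ("assign", dst, [operand, ...])
            dst, srcs = args
            taint[dst] = set().union(*(taint[s] for s in srcs))
        elif op == "load":      # ("load", dst, shared_obj)
            dst, obj = args
            taint[dst] = {("sh", obj)}   # placeholder, resolved offline
        elif op == "store":     # ("store", shared_obj, src)
            obj, src = args
            stores[obj] |= taint[src]
        elif op == "branch":    # ("branch", cond_var)
            (cond,) = args
            branches.append(set(taint[cond]))
    return stores, branches

def aggregate(summaries):
    """Offline pass: merge per-thread summaries, ignoring the observed ordering."""
    shared = defaultdict(set)
    for stores, _ in summaries:
        for obj, labels in stores.items():
            shared[obj] |= labels

    def resolve(labels):
        # Expand shared-object placeholders transitively into input sources.
        inputs, seen, work = set(), set(), list(labels)
        while work:
            kind, name = work.pop()
            if kind == "in":
                inputs.add(name)
            elif name not in seen:
                seen.add(name)
                work.extend(shared.get(name, ()))
        return inputs

    branch_inputs = [[resolve(b) for b in branches] for _, branches in summaries]
    shared_inputs = {obj: resolve(labels) for obj, labels in shared.items()}
    return branch_inputs, shared_inputs

# Hypothetical two-thread execution: thread 0 publishes an input through G,
# thread 1 branches on G and separately writes another input to H.
traces = [
    [("input", "a", "socket"), ("store", "G", "a")],
    [("load", "x", "G"), ("branch", "x"),
     ("input", "b", "file"), ("store", "H", "b")],
]
with ThreadPoolExecutor() as pool:
    summaries = list(pool.map(analyze_thread, traces))
branch_inputs, shared_inputs = aggregate(summaries)
print(branch_inputs)
print(shared_inputs)
```

In this toy run, only the `socket` input reaches a conditional branch (through shared object `G`), while `file` influences the state of `H` but no control flow. A stricter aggregation that admits only stores consistent with some feasible ordering would trade coverage for precision.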


Information & Contributors

Information

Published In

FSE '12: Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering
November 2012
494 pages
ISBN:9781450316149
DOI:10.1145/2393596
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. generalization
  2. relevancy
  3. taint analysis

Qualifiers

  • Research-article

Conference

SIGSOFT/FSE'12
Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 20
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 01 Mar 2025

Citations

Cited By

  • (2024) Efficient Deadlock Detection in MPI Programs with Path Compression and Focus Matching. Proceedings of the 15th Asia-Pacific Symposium on Internetware, pp. 467-476. DOI: 10.1145/3671016.3674822. Online publication date: 24-Jul-2024
  • (2023) MirrorTaint: Practical Non-Intrusive Dynamic Taint Tracking for JVM-Based Microservice Systems. Proceedings of the 45th International Conference on Software Engineering, pp. 2514-2526. DOI: 10.1109/ICSE48619.2023.00210. Online publication date: 14-May-2023
  • (2023) U-VIP-SLAM: Underwater Visual-Inertial-Pressure SLAM for Navigation of Turbid and Dynamic Environments. Arabian Journal for Science and Engineering. DOI: 10.1007/s13369-023-07906-6. Online publication date: 27-May-2023
  • (2022) ConcSpectre: Be Aware of Forthcoming Malware Hidden in Concurrent Programs. IEEE Transactions on Reliability 71:2, pp. 1174-1188. DOI: 10.1109/TR.2022.3162694. Online publication date: Jun-2022
  • (2022) DisTA: Generic Dynamic Taint Tracking for Java-Based Distributed Systems. 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 547-558. DOI: 10.1109/DSN53405.2022.00060. Online publication date: Jun-2022
  • (2022) Dr.PathFinder: hybrid fuzzing with deep reinforcement concolic execution toward deeper path-first search. Neural Computing and Applications 34:13, pp. 10731-10750. DOI: 10.1007/s00521-022-07008-8. Online publication date: 1-Jul-2022
  • (2021) Canary: practical static detection of inter-thread value-flow bugs. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pp. 1126-1140. DOI: 10.1145/3453483.3454099. Online publication date: 19-Jun-2021
  • (2020) Tell You a Definite Answer: Whether Your Data is Tainted During Thread Scheduling. IEEE Transactions on Software Engineering 46:9, pp. 916-931. DOI: 10.1109/TSE.2018.2871666. Online publication date: 1-Sep-2020
  • (2019) NeuralTaint: A Key Segment Marking Tool Based on Neural Network. IEEE Access 7, pp. 68786-68798. DOI: 10.1109/ACCESS.2019.2915681. Online publication date: 2019
  • (2018) Angora: Efficient Fuzzing by Principled Search. 2018 IEEE Symposium on Security and Privacy (SP), pp. 711-725. DOI: 10.1109/SP.2018.00046. Online publication date: May-2018
