DOI: 10.1145/3192366.3192418
research-article
Artifacts Evaluated & Functional

Pinpoint: fast and precise sparse value flow analysis for million lines of code

Published: 11 June 2018

ABSTRACT

When dealing with millions of lines of code, we still cannot have our cake and eat it too: sparse value-flow analysis is powerful in checking source-sink problems, but existing work cannot escape the “pointer trap,” in which a precise points-to analysis limits scalability and an imprecise one seriously undermines precision. We present Pinpoint, a holistic approach that decomposes the cost of high-precision points-to analysis by precisely discovering local data dependence and delaying the expensive inter-procedural analysis through memorization. Such memorization enables the on-demand slicing of only the necessary inter-procedural data dependence and path feasibility queries, which are then solved by a costly SMT solver. Experiments show that Pinpoint can check programs such as MySQL (around 2 million lines of code) within 1.5 hours. The overall false positive rate is also low (14.3% to 23.6%). Pinpoint has discovered over forty real bugs in mature and extensively checked open-source systems. The implementation of Pinpoint and all experimental results are freely available.
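The general idea of caching value-flow summaries and checking path feasibility only for the source-sink pairs that are actually queried can be illustrated with a toy sketch. This is not Pinpoint's algorithm or data structure: the graph `EDGES`, the functions `paths_from`, `satisfiable`, and `check`, and the encoding of guards as sets of boolean literals are all assumptions made for illustration, and the simple contradiction test stands in for the SMT solver mentioned in the abstract. The graph is assumed to be acyclic, with sinks as leaf nodes.

```python
from functools import lru_cache

# Hypothetical value-flow graph: node -> list of (successor, guard).
# A guard is a frozenset of literals (variable_name, required_truth_value),
# read as a conjunction. An empty guard means "unconditional".
EDGES = {
    "free(p)": [("q = p", frozenset({("c", True)}))],
    "q = p": [
        ("use(q) #1", frozenset({("c", False)})),
        ("use(q) #2", frozenset({("d", True)})),
    ],
}


def satisfiable(guard):
    """A conjunction of literals is satisfiable unless it has x and not-x.

    Stand-in for an SMT query; real path conditions are far richer.
    """
    return not any((name, not value) in guard for name, value in guard)


@lru_cache(maxsize=None)
def paths_from(node):
    """Memoized summary: every (leaf, accumulated_guard) reachable from node.

    Guards are only collected here, not checked, so the costly
    feasibility test is deferred until a client asks about a pair.
    """
    successors = EDGES.get(node, [])
    if not successors:
        return ((node, frozenset()),)
    summary = []
    for successor, guard in successors:
        for leaf, rest in paths_from(successor):
            summary.append((leaf, guard | rest))
    return tuple(summary)


def check(source, sinks):
    """Report the sinks reachable from source along a feasible path."""
    return [
        leaf
        for leaf, guard in paths_from(source)
        if leaf in sinks and satisfiable(guard)
    ]


if __name__ == "__main__":
    print(check("free(p)", {"use(q) #1", "use(q) #2"}))
```

In this sketch the path to `use(q) #1` requires `c` to be both true and false, so it is discarded as infeasible, while the path to `use(q) #2` is reported. The cached summary for `q = p` would be reused by any other source that flows into it.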


Supplemental Material

p693-shi.webm (webm, 102.6 MB)


Published in

PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2018
825 pages
ISBN: 9781450356985
DOI: 10.1145/3192366

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 406 of 2,067 submissions, 20%
