Open Access

CCured: type-safe retrofitting of legacy software

Published: 01 May 2005

Abstract

This article describes CCured, a program transformation system that adds type safety guarantees to existing C programs. CCured attempts to verify statically that memory errors cannot occur, and it inserts run-time checks where static verification is insufficient.

CCured extends C's type system by separating pointer types according to their usage, and it uses a surprisingly simple type inference algorithm that is able to infer the appropriate pointer kinds for existing C programs. CCured uses physical subtyping to recognize and verify a large number of type casts at compile time. Additional type casts are verified using run-time type information. CCured uses two instrumentation schemes, one that is optimized for performance and one in which metadata is stored in a separate data structure whose shape mirrors that of the original user data. This latter scheme allows instrumented programs to invoke external functions directly on the program's data without the use of a wrapper function.

We have used CCured on real-world security-critical network daemons to produce instrumented versions without memory-safety vulnerabilities, and we have found several bugs in these programs. The instrumented code is efficient enough to be used in day-to-day operations.
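As a rough illustration of what an inserted run-time check could look like, the C sketch below models a pointer that takes part in pointer arithmetic as a base address, a length, and an offset, and checks the bounds only when the pointer is dereferenced. All names (`seq_int_ptr`, `seq_from_array`, `seq_add`, `seq_read`) are hypothetical, and the layout is an assumption made for this example rather than CCured's actual representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds-carrying pointer to int (illustration only,
   not CCured's real layout). The position is kept as an offset so that
   moving out of bounds never forms an invalid C pointer. */
typedef struct {
    int      *base; /* first element of the underlying array */
    size_t    len;  /* number of elements in the array */
    ptrdiff_t off;  /* current position relative to base; may be out of range */
} seq_int_ptr;

/* Wrap an existing array of n ints. */
static seq_int_ptr seq_from_array(int *arr, size_t n) {
    assert(arr != NULL);
    seq_int_ptr s = { arr, n, 0 };
    return s;
}

/* Pointer arithmetic itself is unchecked; it only moves the offset. */
static seq_int_ptr seq_add(seq_int_ptr s, ptrdiff_t delta) {
    s.off += delta;
    return s;
}

/* Checked read: returns false instead of touching out-of-bounds memory.
   An instrumented program would more likely abort at this point; a flag
   is returned here only so that the behavior is easy to observe. */
static bool seq_read(seq_int_ptr s, int *out) {
    if (s.off < 0 || (size_t)s.off >= s.len) {
        return false;
    }
    *out = s.base[s.off];
    return true;
}
```

With a three-element array, reading at offset 2 succeeds, while reading at offset 3 or -1 is rejected by the check instead of accessing memory outside the array. A pointer that is never used in arithmetic would not need the extra fields, which is the motivation for separating pointer kinds in the first place.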

Reviews

Hans J. Schneider

The authors treat C as a dynamically typed language, but optimize away most of the runtime checks. They distinguish between SAFE pointers, SEQ pointers involving pointer arithmetic, and WILD pointers requiring full runtime checks. The CCured system supports static type checking of programs annotated with these pointer qualifiers, but, in my opinion, the most important contribution is the system's application to existing C programs. Section 2 presents the type system of CCured, its translation into additional information available for runtime checks, and the inference algorithm that discovers the best qualifier for each pointer in legacy programs. The next section gives special attention to reducing the number of WILD pointers by analyzing the use of casts between pointer types. Section 4 summarizes the operational semantics of a representative fragment of the language. In section 5, the authors consider some other troublesome features, such as unions, function pointers, and heap allocation. Linking transformed programs with binary libraries is discussed in section 6. In many cases, it is sufficient to write a wrapper function. The system provides the user with such wrappers for the most commonly used functions from the C standard library. Sections 7 and 8 summarize some experiences. First, the authors describe the process of curing a program. Then, they present the experimental results of testing the system on real-world programs. Most programs take less than 50 percent longer to run when instrumented with CCured. An interesting point is that the tests discovered a number of bugs in well-known benchmark suites. A detailed discussion of related work and 44 references wrap up this clearly written paper. It can be recommended to people concerned with improving legacy system software.
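The idea of choosing a qualifier for each pointer can be sketched in C as follows. This is a deliberately simplified model, not the algorithm from the paper: it assumes that a pointer with a cast that cannot be verified becomes WILD, that a pointer used in arithmetic becomes SEQ, that everything else is SAFE, and that WILD spreads in both directions across assignments. The types and the function `infer_kinds` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { KIND_SAFE, KIND_SEQ, KIND_WILD } ptr_kind;

/* Hypothetical per-pointer summary of how the program uses it. */
typedef struct {
    bool used_in_arithmetic;  /* appears in p + i, p++, p[i], ... */
    bool has_unverified_cast; /* cast that static checking cannot justify */
} ptr_facts;

/* Hypothetical assignment edge "to = from" between pointer indices. */
typedef struct {
    size_t from;
    size_t to;
} flow_edge;

/* Assign a kind to each of the n pointers, then spread WILD along the
   m assignment edges until nothing changes (assumption of this sketch). */
static void infer_kinds(const ptr_facts *facts, size_t n,
                        const flow_edge *edges, size_t m,
                        ptr_kind *kinds) {
    for (size_t i = 0; i < n; i++) {
        if (facts[i].has_unverified_cast) {
            kinds[i] = KIND_WILD;
        } else if (facts[i].used_in_arithmetic) {
            kinds[i] = KIND_SEQ;
        } else {
            kinds[i] = KIND_SAFE;
        }
    }

    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t j = 0; j < m; j++) {
            size_t a = edges[j].from;
            size_t b = edges[j].to;
            assert(a < n && b < n);
            if (kinds[a] == KIND_WILD && kinds[b] != KIND_WILD) {
                kinds[b] = KIND_WILD;
                changed = true;
            } else if (kinds[b] == KIND_WILD && kinds[a] != KIND_WILD) {
                kinds[a] = KIND_WILD;
                changed = true;
            }
        }
    }
}
```

In this model, a single unverified cast can turn a whole chain of assigned pointers WILD, which gives some intuition for why reducing the number of such casts matters for the cost of the inserted checks.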

Published in

ACM Transactions on Programming Languages and Systems, Volume 27, Issue 3
May 2005
200 pages
ISSN: 0164-0925
EISSN: 1558-4593
DOI: 10.1145/1065887

Copyright © 2005 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
