CCured: type-safe retrofitting of legacy software

Authors:
George C. Necula, Jeremy Condit, Matthew Harren, Scott McPeak, Westley Weimer

University of California, Berkeley, Berkeley, CA

ACM Transactions on Programming Languages and Systems, Volume 27, Issue 3, pp. 477–526. https://doi.org/10.1145/1065887.1065892

Published: 01 May 2005

Abstract

This article describes CCured, a program transformation system that adds type safety guarantees to existing C programs. CCured attempts to verify statically that memory errors cannot occur, and it inserts run-time checks where static verification is insufficient.

CCured extends C's type system by separating pointer types according to their usage, and it uses a surprisingly simple type inference algorithm that is able to infer the appropriate pointer kinds for existing C programs. CCured uses physical subtyping to recognize and verify a large number of type casts at compile time. Additional type casts are verified using run-time type information. CCured uses two instrumentation schemes, one that is optimized for performance and one in which metadata is stored in a separate data structure whose shape mirrors that of the original user data. This latter scheme allows instrumented programs to invoke external functions directly on the program's data without the use of a wrapper function.

We have used CCured on real-world security-critical network daemons to produce instrumented versions without memory-safety vulnerabilities, and we have found several bugs in these programs. The instrumented code is efficient enough to be used in day-to-day operations.
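The idea of inserting run-time checks where static verification is insufficient can be illustrated with a minimal C sketch of a pointer that carries its own bounds. This is an illustration only: the type `seq_int` and the functions `seq_make`, `seq_add`, and `seq_read` are hypothetical names chosen here, and the layout is an assumption rather than CCured's actual representation or generated code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds-carrying pointer (not CCured's real layout):
 * the array it refers to plus the current element offset. */
typedef struct {
    int *base;      /* start of the underlying array */
    size_t len;     /* number of elements in the array */
    ptrdiff_t off;  /* current element offset; arithmetic changes only this */
} seq_int;

/* Wrap an array of n ints, positioned at element 0. */
seq_int seq_make(int *arr, size_t n) {
    assert(arr != NULL);
    seq_int s = { arr, n, 0 };
    return s;
}

/* Pointer arithmetic is unchecked; it only moves the offset. */
seq_int seq_add(seq_int s, ptrdiff_t k) {
    s.off += k;
    return s;
}

/* Checked read: stores the element in *out and returns true only
 * when the offset lies inside the array; otherwise returns false. */
bool seq_read(seq_int s, int *out) {
    if (s.off < 0 || (size_t)s.off >= s.len) {
        return false;
    }
    *out = s.base[s.off];
    return true;
}
```

In this sketch the check is deferred to the point of dereference, so an out-of-range offset is harmless until it is actually read. A pointer that is never moved by arithmetic would not need the extra fields at all, which is the motivation for separating pointer types according to their usage.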

Reviews

Reviewer: Hans J. Schneider

The authors treat C as a dynamically typed language, but optimize away most of the runtime checks. They distinguish between SAFE pointers, SEQ pointers involving pointer arithmetic, and WILD pointers requiring full runtime checks. The CCured system supports static type checking of programs annotated with these pointer qualifiers, but, in my opinion, the most important contribution is the system's application to existing C programs.

Section 2 presents the type system of CCured, its translation into the additional information needed for runtime checks, and the inference algorithm that discovers the best qualifier for each pointer in legacy programs. The next section gives special attention to reducing the number of WILD pointers by analyzing the use of casts between pointer types. Section 4 summarizes the operational semantics of a representative fragment of the language. In Section 5, the authors consider some other troublesome features, such as unions, function pointers, and heap allocation. Linking transformed programs with binary libraries is discussed in Section 6. In many cases, it is sufficient to write a wrapper function. The system provides the user with such wrappers for the most commonly used functions from the C standard library.

Sections 7 and 8 summarize some experiences. First, the authors describe the process of curing a program. Then, they present the experimental results of testing the system on real-world programs. Most programs take less than 50 percent longer to run when instrumented with CCured. An interesting point is that the tests discovered a number of bugs in well-known benchmark suites.

A detailed discussion of related work, and 44 references, wraps up this clearly written paper. It can be recommended to people concerned with improving legacy system software.
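The three pointer kinds named in the review can be sketched as a small classification function in C. This is a deliberately simplified, per-pointer illustration: the names `ptr_kind`, `ptr_facts`, and `infer_kind` are hypothetical, and the sketch assumes the facts about each pointer have already been collected. It omits the propagation of kinds between related pointers that a whole-program inference would need.

```c
#include <assert.h>
#include <stdbool.h>

/* The pointer kinds named in the review, from cheapest to most costly. */
typedef enum { KIND_SAFE, KIND_SEQ, KIND_WILD } ptr_kind;

/* Facts a hypothetical analysis has gathered about one pointer. */
typedef struct {
    bool has_arithmetic;       /* pointer arithmetic is applied to it */
    bool has_unverified_cast;  /* it takes part in a cast that could not be verified */
} ptr_facts;

/* Pick the least costly kind that is still consistent with the facts. */
ptr_kind infer_kind(ptr_facts f) {
    if (f.has_unverified_cast) {
        return KIND_WILD;  /* needs full runtime checks */
    }
    if (f.has_arithmetic) {
        return KIND_SEQ;   /* needs bounds information */
    }
    return KIND_SAFE;      /* no arithmetic, no unverified casts */
}
```

The ordering of the tests reflects the design goal described in the review: a pointer falls back to WILD only when a cast cannot be verified, so every cast that can be verified keeps the pointer in one of the cheaper kinds.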

Published in

ACM Transactions on Programming Languages and Systems, Volume 27, Issue 3
May 2005
200 pages
ISSN: 0164-0925
EISSN: 1558-4593
DOI: 10.1145/1065887

Copyright © 2005 ACM

Publisher
Association for Computing Machinery
New York, NY, United States

Author Tags
Memory safety
libraries
pointer qualifier
subtyping