DOI: 10.1145/2815400.2815422

Cross-checking semantic correctness: the case of finding file system bugs

Published: 04 October 2015

Abstract

Today, systems software is too complex to be bug-free. To find bugs in systems software, developers often rely on code checkers, like Linux's Sparse. However, the capability of existing tools used in commodity, large-scale systems is limited to finding only shallow bugs that tend to be introduced by simple programmer mistakes, and so do not require a deep understanding of code to find them. Unfortunately, the majority of bugs as well as those that are difficult to find are semantic ones, which violate high-level rules or invariants (e.g., missing a permission check). Thus, it is difficult for code checkers lacking the understanding of a programmer's true intention to reason about semantic correctness.
To solve this problem, we present Juxta, a tool that automatically infers high-level semantics directly from source code. The key idea in Juxta is to compare and contrast multiple existing implementations that obey latent yet implicit high-level semantics. For example, the implementation of open() at the file system layer expects to handle an out-of-space error from the disk in all file systems. We applied Juxta to 54 file systems in the stock Linux kernel (680K LoC), found 118 previously unknown semantic bugs (one bug per 5.8K LoC), and provided corresponding patches to 39 different file systems, including mature, popular ones like ext4, btrfs, XFS, and NFS. These semantic bugs are not easy to locate, as all the ones found by Juxta have existed for over 6.2 years on average. Not only do our empirical results look promising, but the design of Juxta is generic enough to be extended easily beyond file systems to any software that has multiple implementations, like Web browsers or protocols at the same layer of a network stack.
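
The cross-checking idea can be illustrated with a toy sketch in Python. This is not Juxta's implementation: it only shows the intuition that when most implementations of the same interface handle an error, the few that do not are worth inspecting. The file system names, the error-code sets, the `find_deviants` helper, and the 0.75 agreement threshold are all hypothetical and chosen for illustration.

```python
from collections import Counter


def find_deviants(handled_errors, threshold=0.75):
    """Flag implementations that miss an error code most peers handle.

    handled_errors maps an implementation name to the set of error codes
    it handles for one shared interface (for example, open()).
    threshold is the (assumed) fraction of implementations that must
    handle a code before it is treated as an expected behavior.
    """
    total = len(handled_errors)
    # Count how many implementations handle each error code.
    counts = Counter(
        code for codes in handled_errors.values() for code in set(codes)
    )
    # Codes handled by at least `threshold` of all implementations.
    common = {code for code, n in counts.items() if n / total >= threshold}

    report = {}
    for name, codes in handled_errors.items():
        missing = common - set(codes)
        if missing:
            report[name] = sorted(missing)
    return report


# Hypothetical data: five made-up file systems and the errors their
# open() path handles. fs_e never handles the out-of-space error.
handled = {
    "fs_a": {"ENOSPC", "EIO", "ENOMEM"},
    "fs_b": {"ENOSPC", "EIO", "ENOMEM"},
    "fs_c": {"ENOSPC", "EIO", "ENOMEM"},
    "fs_d": {"ENOSPC", "EIO", "ENOMEM"},
    "fs_e": {"EIO", "ENOMEM"},
}

print(find_deviants(handled))  # {'fs_e': ['ENOSPC']}
```

A flagged entry is only a candidate: a deviation may be a bug or a legitimate design difference, so each report still needs manual review.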

Published In

SOSP '15: Proceedings of the 25th Symposium on Operating Systems Principles
October 2015
499 pages
ISBN:9781450338349
DOI:10.1145/2815400
Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

  • DARPA
  • ONR
  • NSF
  • ETRI

Acceptance Rates

SOSP '15 Paper Acceptance Rate 30 of 181 submissions, 17%;
Overall Acceptance Rate 174 of 961 submissions, 18%
