ABSTRACT
Software systems often exhibit a surprising flexibility in the range of execution paths they can take to produce an acceptable result. This flexibility enables new techniques that augment systems with the ability to productively tolerate a wide range of errors. We show how to exploit this flexibility to obtain transformations that improve reliability and robustness or trade off accuracy in return for increased performance or decreased power consumption. We discuss how to use empirical, probabilistic, and statistical reasoning to understand why these techniques work.
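As a rough illustration of trading accuracy for performance, the sketch below applies loop perforation (skipping a fraction of loop iterations) to a simple averaging loop and then measures the resulting error empirically. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `stride` parameter, and the synthetic input data are all hypothetical and chosen only for this example.

```python
import random


def mean_exact(values):
    """Baseline computation: average every element."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def mean_perforated(values, stride=2):
    """Perforated variant: visit only every `stride`-th element.

    `stride` is a hypothetical knob for this sketch; a larger stride
    does less work and typically yields a less accurate result.
    """
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


# Synthetic input (an assumption made only for this illustration).
rng = random.Random(0)
data = [rng.uniform(0.0, 100.0) for _ in range(10_000)]

exact = mean_exact(data)
approx = mean_perforated(data, stride=4)

# Empirical accuracy check: how far is the cheaper result from the exact one?
relative_error = abs(exact - approx) / abs(exact)
print(f"exact={exact:.3f} approx={approx:.3f} relative_error={relative_error:.4f}")
```

Here the perforated loop performs a quarter of the iterations. Whether the measured error is acceptable depends on the application, which is the kind of question that empirical and statistical reasoning is meant to answer.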