Abstract
Log-based recovery protocols enable process replicas in distributed systems to replay a computation up to the point where a previous execution failed. A fundamental assumption underlying these protocols is the piecewise deterministic (PWD) execution model, which states that recovery must not re-execute nondeterministic events but simulate their execution in order to maintain consistency.
One such source of nondeterminism is asynchronous events that trigger software signal handlers, a problem known to be solvable with instruction counters. Efficient software implementations have been shown to be practical, but require significant changes to applications and system software. Hardware counters, in contrast, allow software to run unmodified. A number of processors implementing the Intel x86 instruction set architecture provide monitoring registers with properties similar to those of a true instruction counter.
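As a minimal sketch of why an instruction counter makes asynchronous signal delivery replayable, consider the toy model below. It is an illustration only, not the authors' implementation: the `ToyProcess` class, its arithmetic, and the `record`/`replay` helpers are all hypothetical. The live run logs the instruction count at which the signal arrived, and the replay delivers the logged signal at exactly that count instead of waiting for a real one.

```python
from dataclasses import dataclass


@dataclass
class ToyProcess:
    """Deterministic toy process; a signal is its only nondeterminism."""
    state: int = 0
    icount: int = 0  # instructions retired so far (the instruction counter)

    def step(self) -> None:
        # One deterministic "instruction".
        self.state = 3 * self.state + 1
        self.icount += 1

    def handle_signal(self, payload: int) -> None:
        # The signal handler perturbs the state.
        self.state += payload


def record(total_steps: int, signal_at: int, payload: int):
    """Live run: the signal arrives after `signal_at` instructions."""
    p = ToyProcess()
    log = []
    while p.icount < total_steps:
        if p.icount == signal_at:
            log.append((p.icount, payload))  # log the event with its count
            p.handle_signal(payload)
        p.step()
    return p.state, log


def replay(total_steps: int, log):
    """Recovery: simulate each logged signal at its recorded count."""
    p = ToyProcess()
    pending = list(log)
    while p.icount < total_steps:
        if pending and pending[0][0] == p.icount:
            _, payload = pending.pop(0)
            p.handle_signal(payload)
        p.step()
    return p.state


state, log = record(total_steps=20, signal_at=7, payload=5)
assert replay(20, log) == state          # same count, same final state
assert replay(20, [(6, 5)]) != state     # one instruction off diverges
```

In this model the final state depends on how many instructions follow the handler, so delivering the signal even one instruction early yields a different result. That is the precision requirement an instruction counter has to meet.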
Designed for application profiling, these facilities reveal a number of issues that must be resolved when they are used for applications like the PWD model, which demands maximum precision during replay. We discuss some of the most prominent problems faced when using performance counters for protocols satisfying the PWD model. We present additional hardware mechanisms, based on standard processor debugging facilities, that eliminate inconsistencies in counter interrupt delivery at the expense of a small number of additionally generated exceptions.
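The abstract does not spell out the mechanism, so the following toy simulation shows only one plausible reading of it: a commonly used way to combine an imprecise counter interrupt with a debugging facility. It assumes the counter interrupt is delivered up to `MAX_SKID` instructions late (a hypothetical bound), programs the overflow that many instructions early, and then single-steps to the exact target. All names and values here are assumptions for illustration.

```python
import random

MAX_SKID = 8  # hypothetical upper bound on interrupt delivery delay


def run_until_overflow(overflow_at: int, rng: random.Random) -> int:
    """Model an imprecise counter interrupt: execution stops somewhere
    in [overflow_at, overflow_at + MAX_SKID]."""
    return overflow_at + rng.randint(0, MAX_SKID)


def stop_exactly_at(target: int, rng: random.Random):
    """Return (final instruction count, number of single-step exceptions)."""
    icount = 0
    early = target - MAX_SKID  # program the overflow early enough
    if early > 0:
        icount = run_until_overflow(early, rng)
    steps = 0
    while icount < target:
        icount += 1  # one instruction executed in single-step mode
        steps += 1   # each step costs one additional debug exception
    return icount, steps


rng = random.Random(0)
for target in (3, 50, 1000):
    icount, steps = stop_exactly_at(target, rng)
    assert icount == target
    assert 0 <= steps <= MAX_SKID
```

Under these assumptions the stop point is always exact, and the cost is at most `MAX_SKID` extra exceptions per replayed event, which matches the abstract's description of a small number of additionally generated exceptions.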
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stodden, D., Eichner, H., Walter, M., Trinitis, C. (2006). Hardware Instruction Counting for Log-Based Rollback Recovery on x86-Family Processors. In: Penkler, D., Reitenspiess, M., Tam, F. (eds) Service Availability. ISAS 2006. Lecture Notes in Computer Science, vol 4328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11955498_8
DOI: https://doi.org/10.1007/11955498_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68724-5
Online ISBN: 978-3-540-68725-2
eBook Packages: Computer Science (R0)