Software monitoring with controllable overhead

  • Runtime Verification
International Journal on Software Tools for Technology Transfer

Abstract

We introduce the technique of software monitoring with controllable overhead (SMCO), which is based on a novel combination of supervisory control theory of discrete event systems and PID-control theory of discrete time systems. SMCO controls monitoring overhead by temporarily disabling monitoring of selected events for as short a time as possible under the constraint of a user-supplied target overhead o_t. This strategy is optimal in the sense that it allows SMCO to monitor as many events as possible, within the confines of o_t. SMCO is a general monitoring technique that can be applied to any system interface or API. We have applied SMCO to a variety of monitoring problems, including two highlighted in this paper: integer range analysis, which determines upper and lower bounds on integer variable values; and non-accessed period detection, which detects stale or underutilized memory allocations. We benchmarked SMCO extensively, using both CPU- and I/O-intensive workloads, which often exhibited highly bursty behavior. We demonstrate that SMCO successfully controls overhead across a wide range of target overhead levels; its accuracy monotonically increases with the target overhead; and it can be configured to distribute monitoring overhead fairly across multiple instrumentation points.
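
The feedback idea in the abstract can be illustrated with a small simulation. The Python sketch below is not the SMCO implementation. It assumes a simplified setting in which a discrete-time PID controller adjusts the fraction of events monitored in each interval, standing in for how long monitoring stays disabled, so that the measured overhead tracks the user-supplied target overhead. The names PIDController and simulate, the gains, the per-event cost, and the workload are all hypothetical choices made for illustration. Only the integral gain is nonzero by default, because the simulated system responds to the controller without delay.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


class PIDController:
    """Textbook discrete-time PID controller with a clamped output.

    Hypothetical helper for this sketch; the gains are illustrative only.
    """

    def __init__(self, kp, ki, kd, initial_output, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        # The integral term starts at the initial output and is clamped on
        # every update, which prevents integrator windup.
        self.integral = initial_output
        self.prev_error = 0.0

    def update(self, error):
        self.integral = clamp(self.integral + self.ki * error,
                              self.out_min, self.out_max)
        derivative = error - self.prev_error
        self.prev_error = error
        output = self.kp * error + self.integral + self.kd * derivative
        return clamp(output, self.out_min, self.out_max)


def simulate(target_overhead, events_per_interval, event_cost=0.01, ki=0.5):
    """Simulate overhead control over a sequence of intervals.

    Assumptions of this sketch: each interval performs 1 time unit of useful
    work, and monitoring one event costs event_cost time units, so the
    overhead of an interval is simply monitored_events * event_cost.

    Returns a list of (fraction, monitored_events, overhead) per interval.
    """
    pid = PIDController(kp=0.0, ki=ki, kd=0.0, initial_output=1.0)
    fraction = 1.0  # start by monitoring every event
    history = []
    for event_count in events_per_interval:
        monitored = round(fraction * event_count)
        overhead = monitored * event_cost
        history.append((fraction, monitored, overhead))
        # Positive error: room to monitor more. Negative error: back off.
        fraction = pid.update(target_overhead - overhead)
    return history


# Example: a workload whose event rate doubles for 20 intervals.
workload = [100] * 20 + [200] * 20 + [100] * 20
for fraction, monitored, overhead in simulate(0.2, workload)[::10]:
    print(f"fraction={fraction:.3f} monitored={monitored} overhead={overhead:.3f}")
```

In this simplified model the controller reduces the monitored fraction when a burst pushes the overhead above the target and raises it again afterwards, and a larger target lets more events be monitored. The gain ki=0.5 was chosen to be stable for the event rates in this example; other workloads would need different tuning.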

Author information

Corresponding author

Correspondence to Justin Seyster.

About this article

Cite this article

Huang, X., Seyster, J., Callanan, S. et al. Software monitoring with controllable overhead. Int J Softw Tools Technol Transfer 14, 327–347 (2012). https://doi.org/10.1007/s10009-010-0184-4

  • DOI: https://doi.org/10.1007/s10009-010-0184-4
