column

SelfTalk for Dena: query language and runtime support for evaluating system behavior

Published: 12 March 2010

Abstract

We introduce SelfTalk, a novel declarative language that allows users to query and understand the status of a large-scale system. SelfTalk is sufficiently expressive to encode an administrator's high-level hypotheses and expectations about normal system behavior, such as "I expect that the throughputs across all system components are linearly correlated." SelfTalk works in conjunction with Dena, a runtime support system designed to help system administrators detect the root cause of system misbehavior quickly and accurately. Given a user hypothesis, Dena instantiates and validates it using actual monitored data within specific system contexts. We evaluate Dena by posing several hypotheses about system behavior and querying Dena to diagnose anomalies in a virtual storage system. We find that Dena can automatically validate system performance against the user hypotheses and accurately diagnose system misbehavior.
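
As a rough illustration of the kind of hypothesis quoted in the abstract, the sketch below checks whether throughput samples from several components are pairwise linearly correlated. It is not SelfTalk syntax and not Dena's implementation, neither of which is shown on this page. The function names, the component names, the sample values, and the 0.9 threshold are all assumptions made for the example, and the Pearson coefficient is used here only as one plausible test of linear correlation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant series as uncorrelated
    return cov / sqrt(var_x * var_y)


def check_linear_hypothesis(throughputs, min_abs_r=0.9):
    """Return the component pairs that violate the expectation.

    throughputs maps a (hypothetical) component name to its samples;
    min_abs_r is an assumed threshold, not a value from the paper.
    """
    violations = []
    for a, b in combinations(sorted(throughputs), 2):
        r = pearson(throughputs[a], throughputs[b])
        if abs(r) < min_abs_r:
            violations.append((a, b, round(r, 3)))
    return violations


# Made-up monitoring data: "disk" stays flat while the others scale.
monitored = {
    "db": [100, 200, 300, 400, 500],
    "cache": [52, 98, 151, 203, 249],
    "disk": [40, 41, 39, 40, 41],
}
violations = check_linear_hypothesis(monitored)
for a, b, r in violations:
    print(f"{a} <-> {b}: r = {r}")
```

With this data, the pairs involving the hypothetical "disk" component fall below the threshold, so a check of this kind would point an administrator toward that component first.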

    • Published in

      ACM SIGOPS Operating Systems Review, Volume 44, Issue 1
      January 2010
      115 pages
      ISSN:0163-5980
      DOI:10.1145/1740390
      Copyright © 2010 Copyright is held by the owner/author(s)

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 March 2010

      Qualifiers

      • column
