research-article

Monitoring multi-tier clustered systems with invariant metric relationships

Authors:
Mohammad Ahmad Munawar, University of Waterloo, Waterloo, ON, Canada
Michael Jiang, University of Waterloo, Waterloo, ON, Canada
Paul A. S. Ward, University of Waterloo, Waterloo, ON, Canada
SEAMS '08: Proceedings of the 2008 international workshop on Software engineering for adaptive and self-managing systems
May 2008, Pages 73–80
https://doi.org/10.1145/1370018.1370032

Published: 12 May 2008

ABSTRACT

To ensure high availability, self-managing systems require self-monitoring and a system model against which to analyze monitoring data. Characterizing relationships between system metrics has been shown to model simple multi-tier transaction systems effectively, enabling failure detection and fault diagnosis. In this paper we show how to extend this invariant metric-relationships approach to clustered multi-tier systems. We show through analysis and experimentation that naive application of the approach increases cost dramatically while reducing diagnosis accuracy. We demonstrate that randomization at the load balancer during the invariant-identification phase will improve diagnosis accuracy, though it neither completely eliminates the problem nor reduces the cost; indeed, it may increase the cost, as this approach will require a long learning phase to remove all accidental correlations. Finally, we argue that knowing the system structure is necessary to effectively apply invariants to the clustered environment.
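The abstract does not say how an invariant metric relationship is modelled, so the following is only a minimal sketch under stated assumptions: each relationship is taken to be a linear fit between a pair of metrics, kept when its R² on fault-free training data exceeds a threshold, and flagged as violated when a new observation's residual exceeds a tolerance. The metric names, the `min_r2` value, and the tolerance rule are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def fit_line(xs, ys):
    """Least-squares fit ys ~ slope * xs + intercept; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # a constant metric cannot form a useful relationship
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def learn_invariants(samples, min_r2=0.9):
    """Learning phase: keep metric pairs whose linear fit is strong on training data.

    samples maps a metric name to a list of readings (all lists the same length).
    Returns {(metric_a, metric_b): (slope, intercept, tolerance)}.
    """
    invariants = {}
    for a, b in combinations(sorted(samples), 2):
        fit = fit_line(samples[a], samples[b])
        if fit is None or fit[2] < min_r2:
            continue
        slope, intercept, _ = fit
        residuals = [abs(y - (slope * x + intercept))
                     for x, y in zip(samples[a], samples[b])]
        # Hypothetical tolerance: three times the worst training residual.
        tolerance = 3 * max(residuals) + 1e-9
        invariants[(a, b)] = (slope, intercept, tolerance)
    return invariants


def violated(invariants, observation):
    """Monitoring phase: list the pairs whose relationship the observation breaks."""
    broken = []
    for (a, b), (slope, intercept, tolerance) in invariants.items():
        predicted = slope * observation[a] + intercept
        if abs(observation[b] - predicted) > tolerance:
            broken.append((a, b))
    return broken


# Hypothetical fault-free training data: queries track requests, GC pauses do not.
training = {
    "web_requests": [10, 20, 30, 40, 50],
    "db_queries": [21, 39, 61, 80, 99],
    "gc_pauses": [3, 1, 4, 1, 5],
}
invariants = learn_invariants(training)
print(sorted(invariants))
print(violated(invariants, {"web_requests": 60, "db_queries": 120}))
print(violated(invariants, {"web_requests": 60, "db_queries": 10}))
```

With m metrics this sketch examines m(m−1)/2 candidate pairs, so pooling the metrics of every node in a cluster makes the candidate set grow quadratically with cluster size. That is one plausible reading of the cost increase the abstract reports for the naive approach, not a restatement of the paper's own analysis.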



Published in
SEAMS '08: Proceedings of the 2008 international workshop on Software engineering for adaptive and self-managing systems
May 2008
144 pages
ISBN: 9781605580371
DOI: 10.1145/1370018
Program Chairs:
Betty Cheng, Michigan State University, USA
Rogério de Lemos, University of Kent, UK
David Garlan, Carnegie Mellon University, USA
Holger Giese, Hasso Plattner Institute, Germany
Marin Litoiu, IBM Toronto Lab, Canada
Jeff Magee, Imperial College, UK
Hausi Müller, University of Victoria, Canada
Richard Taylor, University of California, Irvine, USA
Copyright © 2008 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
autonomic computing
behavioural models
system management
Acceptance Rates
SEAMS '08 Paper Acceptance Rate: 17 of 31 submissions, 55%
Overall Acceptance Rate: 17 of 31 submissions, 55%