DOI: 10.1145/1978942.1978966
research-article

CueT: human-guided fast and accurate network alarm triage

Published: 07 May 2011

ABSTRACT

Network alarm triage refers to grouping and prioritizing a stream of low-level device health information to help operators find and fix problems. Today, this process tends to be largely manual because existing tools cannot easily evolve with the network. We present CueT, a system that uses interactive machine learning to learn from the triaging decisions of operators. It then uses that learning in novel visualizations to help them quickly and accurately triage alarms. Unlike prior interactive machine learning systems, CueT handles a highly dynamic environment where the groups of interest are not known a priori and evolve constantly. A user study with real operators and data from a large network shows that CueT significantly improves the speed and accuracy of alarm triage compared to the network's current practice.
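The abstract does not say how CueT learns or scores alarms, so the following is only an illustrative sketch of the general idea: remember each operator decision, recommend existing groups for a new alarm, and suggest a new group when nothing is similar enough. The class `TriageSketch`, the word-overlap (Jaccard) similarity, and the `new_group_threshold` value are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass, field


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity in [0, 1]; a stand-in for a learned metric."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class TriageSketch:
    # Hypothetical cutoff: below it, no existing group is recommended.
    new_group_threshold: float = 0.3
    # Group name -> token sets of alarms that operators placed in that group.
    groups: dict = field(default_factory=dict)

    @staticmethod
    def tokens(alarm: str) -> set:
        return set(alarm.lower().split())

    def recommend(self, alarm: str) -> list:
        """Return (group, score) pairs, best first.

        An empty list means no known group is close enough, which the
        interface could present as "start a new group".
        """
        toks = self.tokens(alarm)
        scored = []
        for name, members in self.groups.items():
            score = max(jaccard(toks, m) for m in members)
            if score >= self.new_group_threshold:
                scored.append((name, score))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)

    def record_decision(self, alarm: str, group: str) -> None:
        """Store an operator's triage decision; groups are created on demand."""
        self.groups.setdefault(group, []).append(self.tokens(alarm))
```

Because groups are created only when an operator records a decision, the set of groups can grow and change as the alarm stream does, which is the property the abstract emphasizes. A real system would replace the fixed similarity with one that adapts to operator feedback.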


        Published in

          CHI '11: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
          May 2011
          3530 pages
          ISBN: 9781450302289
          DOI: 10.1145/1978942

          Copyright © 2011 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          CHI '11 Paper Acceptance Rate: 410 of 1,532 submissions, 27%
          Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
