DOI: 10.1145/3038912.3052595
research-article

AutoCyclone: Automatic Mining of Cyclic Online Activities with Robust Tensor Factorization

Published: 3 April 2017

ABSTRACT

Given a collection of seasonal time-series, how can we find regular (cyclic) patterns and outliers (i.e., rare events)? These two types of patterns are hidden and mixed in the time-varying activities. How can we robustly separate regular patterns from outliers, without requiring any prior information?

We present CycloneM, a unifying model to capture both cyclic patterns and outliers, and CycloneFact, a novel algorithm which solves the above problem. We also present an automatic mining framework, AutoCyclone, based on CycloneM and CycloneFact. Our method has the following properties: (a) effective: it captures important cyclic features such as trend and seasonality, and clearly distinguishes regular patterns from rare events; (b) robust and accurate: it detects the above features and patterns accurately even in the presence of outliers; (c) fast: CycloneFact takes linear time in the data size and typically converges in a few iterations; (d) parameter-free: our modeling framework frees the user from having to provide parameter values.

Extensive experiments on 4 real datasets demonstrate the benefits of the proposed model and algorithm, in that the model can capture latent cyclic patterns, trends and rare events, and the algorithm outperforms the existing state-of-the-art approaches. CycloneFact was up to 5 times more accurate and 20 times faster than top competitors.
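
The abstract does not give the details of CycloneM or CycloneFact, so the following is only a generic illustration of the underlying idea of separating a regular cyclic component from sparse outliers. It is a minimal sketch, assuming a single series with a known period, folded into a (cycle × position) matrix rather than a tensor, and it uses a standard low-rank-plus-sparse alternation (truncated SVD and soft-thresholding). The function names, the `period`, `rank`, `lam`, and `n_iter` parameters, and the synthetic data are all hypothetical; unlike the method described above, this sketch is not parameter-free.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink entries toward zero by lam (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def robust_cyclic_split(series, period, rank=1, lam=1.0, n_iter=50):
    """Split a series into a low-rank cyclic part and sparse outliers.

    Hypothetical helper for illustration; NOT the paper's CycloneFact.
    """
    n_cycles = len(series) // period
    # Fold the series so that each row holds one full cycle.
    X = np.asarray(series[: n_cycles * period], dtype=float).reshape(n_cycles, period)
    O = np.zeros_like(X)
    for _ in range(n_iter):
        # Low-rank fit to the data with the current outlier estimate removed.
        U, s, Vt = np.linalg.svd(X - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Residuals larger than lam in magnitude are kept as outliers.
        O = soft_threshold(X - L, lam)
    return L.ravel(), O.ravel()

# Synthetic weekly pattern (assumed data) with one injected rare event.
t = np.arange(7 * 20)
clean = 10 + 5 * np.sin(2 * np.pi * t / 7)
series = clean.copy()
series[45] += 30.0

regular, outliers = robust_cyclic_split(series, period=7, lam=1.0)
print("largest outlier at index", int(np.argmax(np.abs(outliers))))
```

Because outliers are absorbed by the sparse term instead of distorting the low-rank fit, the recovered cyclic component stays close to the clean pattern; a larger `lam` flags fewer points as outliers.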


Published in

WWW '17: Proceedings of the 26th International Conference on World Wide Web
April 2017
1678 pages
ISBN: 9781450349130

Copyright © 2017 Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland


Acceptance Rates

WWW '17 Paper Acceptance Rate: 164 of 966 submissions, 17%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
