DOI: 10.1145/502512.502562
Article

Finding simple intensity descriptions from event sequence data

Published: 26 August 2001

ABSTRACT

Sequences of events are an important type of data arising in various applications, including telecommunications, biostatistics, and web access analysis. A basic approach to modeling such sequences is to find the underlying intensity functions describing the expected number of events per time unit. Typically, the intensity functions are assumed to be piecewise constant. We therefore consider different ways of fitting intensity models to event sequence data. We start by considering a Bayesian approach using Markov chain Monte Carlo (MCMC) methods with a varying number of pieces. These methods can be used to produce posterior distributions on the intensity functions, and they can also accommodate covariates. The drawback is that they are computationally intensive and thus not very suitable for data mining applications in which large numbers of intensity functions have to be estimated. We then consider dynamic programming approaches to finding the change points in the intensity functions. These methods can find the maximum likelihood intensity function in O(n^2 k) time for a sequence of n events and k different pieces of intensity. We show that simple heuristics can be used to prune the number of potential change points, yielding speedups of several orders of magnitude. The results of the improved dynamic programming method correspond very closely with the posterior averages produced by the MCMC methods.
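The dynamic programming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the half-open interval convention, and the use of an explicit list of candidate change points are all assumptions made for illustration. The sketch assumes a Poisson model in which a piece with c events and length d has maximum likelihood rate c/d, and it searches over segmentations whose boundaries come from a sorted `candidates` list. With m candidate intervals the triple loop costs O(m^2 k), which matches the stated O(n^2 k) bound when the candidates are the n event times.

```python
import math
from bisect import bisect_left


def segment_loglik(count, length):
    # Poisson log-likelihood of one constant piece at its ML rate count/length.
    # The "-count" term is dropped: it sums to the same constant for every segmentation.
    return count * math.log(count / length) if count > 0 else 0.0


def fit_piecewise_constant(events, candidates, k):
    """Best k-piece constant intensity with boundaries taken from `candidates`.

    Assumptions (hypothetical interface): `candidates` is strictly increasing,
    its first and last entries delimit the observation window, and every event
    lies in [candidates[0], candidates[-1]).
    Returns (log-likelihood up to a constant, change points, intensities).
    """
    events = sorted(events)
    m = len(candidates) - 1  # number of elementary intervals
    if not 1 <= k <= m:
        raise ValueError("need 1 <= k <= number of elementary intervals")
    # cum[i] = number of events strictly before candidates[i]
    cum = [bisect_left(events, c) for c in candidates]

    neg_inf = float("-inf")
    # best[j][i]: best score for covering the first i intervals with exactly j pieces
    best = [[neg_inf] * (m + 1) for _ in range(k + 1)]
    back = [[0] * (m + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, m + 1):
            for p in range(j - 1, i):  # last piece spans intervals p..i-1
                score = best[j - 1][p] + segment_loglik(
                    cum[i] - cum[p], candidates[i] - candidates[p]
                )
                if score > best[j][i]:
                    best[j][i] = score
                    back[j][i] = p

    # Trace the boundary indices back from the full window.
    bounds = [m]
    i = m
    for j in range(k, 0, -1):
        i = back[j][i]
        bounds.append(i)
    bounds.reverse()

    change_points = [candidates[b] for b in bounds[1:-1]]
    intensities = [
        (cum[b] - cum[a]) / (candidates[b] - candidates[a])
        for a, b in zip(bounds, bounds[1:])
    ]
    return best[k][m], change_points, intensities
```

Because the cost depends on the number of candidates rather than directly on the number of events, shortening `candidates` is one way to picture how pruning potential change points could reduce running time; the specific pruning heuristics of the paper are not reproduced here.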

Published in

KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
August 2001
493 pages
ISBN: 158113391X
DOI: 10.1145/502512

Copyright © 2001 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

KDD '01 paper acceptance rate: 31 of 237 submissions, 13%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.
