
History Guided Low-Cost Change Detection in Streams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5691)

Abstract

Change detection in continuous data streams is very useful in today’s computing environment. However, high computation overhead prevents many data mining algorithms from being used for online monitoring. We propose a history-guided low-cost change detection method based on the “s-monitor” approach. The “s-monitor” approach monitors the stream with simple models (“s-monitors”) that reflect changes in complicated models. By interleaving frequent s-monitor checks with infrequent complicated model checks, we can keep a close eye on the stream without heavy computation overhead.
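The interleaving idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `monitor_stream`, the parameter `full_every`, and the rule "run the expensive check when the s-monitor fires or a periodic check is due" are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Sequence

Window = Sequence[float]


def monitor_stream(
    windows: Iterable[Window],
    s_monitor: Callable[[Window], bool],
    full_model_check: Callable[[Window], bool],
    full_every: int = 10,
) -> List[int]:
    """Return indices of windows where the expensive check confirmed a change.

    Hypothetical scheduling policy (an assumption, not taken from the paper):
    the cheap s-monitor runs on every window, and the expensive model check
    runs only when the s-monitor fires or every `full_every`-th window.
    """
    if full_every < 1:
        raise ValueError("full_every must be at least 1")

    confirmed: List[int] = []
    for i, window in enumerate(windows):
        suspicious = s_monitor(window)             # frequent, cheap check
        scheduled = (i + 1) % full_every == 0      # infrequent, periodic check
        if suspicious or scheduled:
            if full_model_check(window):           # expensive check
                confirmed.append(i)
    return confirmed
```

With a policy like this, the expensive check runs on a small fraction of windows when the stream is stable, while the periodic check still catches changes that the s-monitor misses.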

The selection of s-monitors is critical for successful change detection. History can often provide insight into which s-monitors to select and how to monitor the stream. We demonstrate this method using subspace cluster monitoring for log data and frequent item set monitoring for retail data. Our experiments show that this approach can catch more changes in a more timely manner with lower cost than traditional approaches.
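One way to picture history-guided selection is to score each candidate s-monitor by how well its past alarms matched past changes of the complicated model. The Python sketch below assumes such a labelled history exists and ranks candidates by F1 score. Both the scoring rule and the names (`select_s_monitors`, `candidate_fired`, `model_changed`) are hypothetical; the abstract does not state how the authors select s-monitors.

```python
from typing import Dict, List


def select_s_monitors(
    candidate_fired: Dict[str, List[bool]],
    model_changed: List[bool],
    k: int = 1,
) -> List[str]:
    """Pick the k candidates whose past alarms best matched model changes.

    candidate_fired maps a candidate name to one flag per historical window
    (True if that cheap monitor signalled a change). model_changed holds the
    flags from the expensive model on the same windows. Ranking by F1 is an
    illustrative assumption, not the paper's criterion.
    """

    def f1(fired: List[bool]) -> float:
        pairs = list(zip(fired, model_changed))
        tp = sum(1 for f, c in pairs if f and c)        # alarm and real change
        fp = sum(1 for f, c in pairs if f and not c)    # false alarm
        fn = sum(1 for f, c in pairs if not f and c)    # missed change
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    ranked = sorted(
        candidate_fired,
        key=lambda name: f1(candidate_fired[name]),
        reverse=True,
    )
    return ranked[:k]
```

A candidate that fires on every window catches every change but scores poorly because of its false alarms, so it ranks below one that fires only when the model actually changed.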

The same approach can be applied to different models in various applications, such as monitoring live weather data, stock market fluctuations and network traffic streams.

This research has been supported by National Science Foundation Grant CCR-0121643.




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Huang, W., Omiecinski, E., Mark, L., Nguyen, M.Q. (2009). History Guided Low-Cost Change Detection in Streams. In: Pedersen, T.B., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2009. Lecture Notes in Computer Science, vol 5691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03730-6_7


  • DOI: https://doi.org/10.1007/978-3-642-03730-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03729-0

  • Online ISBN: 978-3-642-03730-6

  • eBook Packages: Computer Science, Computer Science (R0)
