
User Subjectivity in Change Modeling of Streaming Itemsets

  • Conference paper
Advanced Data Mining and Applications (ADMA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Abstract

Online mining of changes from data streams is an important problem in view of the growing number of applications such as network flow analysis, e-business, and stock market analysis. Monitoring these changes is challenging because data streams are high-speed, high-volume, and can be examined only once. User subjectivity in monitoring and modeling the changes adds to the complexity of the problem.

This paper addresses the problems of (i) capturing user subjectivity and (ii) modeling change in applications that monitor the frequency behavior of item-sets. We propose a three-stage strategy for focusing on item-sets that are of current interest to the user, and introduce metrics that model changes in their frequency (support) behavior.
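As a rough illustration of what monitoring the support of user-selected item-sets over a stream can look like, the sketch below computes per-window support for a user-supplied list of item-sets and reports the difference between consecutive windows. It is not the three-stage strategy or the metrics proposed in the paper, whose details are not given in the abstract. The fixed-size tumbling window, the plain support difference, and the names `window_support` and `support_changes` are all assumptions made for illustration.

```python
from collections import Counter


def window_support(transactions, itemsets):
    """Return the support of each monitored item-set within one window.

    Support is the fraction of transactions that contain every item of the set.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for itemset in itemsets:
            if itemset <= items:  # the item-set is contained in the transaction
                counts[itemset] += 1
    return {s: (counts[s] / n if n else 0.0) for s in itemsets}


def support_changes(stream, itemsets, window_size):
    """Yield (window_index, {itemset: support delta}) for each window after the first.

    Hypothetical change measure: difference in support between consecutive
    tumbling windows. Each transaction is read exactly once.
    """
    previous = None
    window = []
    index = 0
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            current = window_support(window, itemsets)
            if previous is not None:
                yield index, {s: current[s] - previous[s] for s in itemsets}
            previous = current
            window = []
            index += 1


# Usage: the user declares interest in the item-set {a, b} only.
monitored = [frozenset({"a", "b"})]
stream = [
    ["a", "b"], ["a"], ["a", "b"], ["b"],       # window 0
    ["a", "b"], ["a", "b"], ["a", "b"], ["c"],  # window 1
]
changes = list(support_changes(stream, monitored, window_size=4))
```

Restricting the computation to the monitored item-sets is one simple way to reflect user interest; only the counts for those sets are kept, so memory use does not grow with the number of distinct items in the stream.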




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bhatnagar, V., Kochhar, S.K. (2005). User Subjectivity in Change Modeling of Streaming Itemsets. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_96

  • DOI: https://doi.org/10.1007/11527503_96

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer Science (R0)
