Abstract
Online mining of changes from data streams is an important problem in view of the growing number of applications such as network flow analysis, e-business, and stock market analysis. Monitoring these changes is challenging because data streams are high-speed, high-volume, and can be read only once. User subjectivity in monitoring and modeling the changes adds to the complexity of the problem.
This paper addresses the problems of (i) capturing user subjectivity and (ii) change modeling in applications that monitor the frequency behavior of item-sets. We propose a three-stage strategy for focusing on item-sets that are of current interest to the user, and introduce metrics that model changes in their frequency (support) behavior.
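To make the notion of a change in support concrete, the following minimal sketch computes the support of one item-set over consecutive, non-overlapping windows of a stream and reports the difference between successive windows. It is only an illustration under stated assumptions: the function names `support` and `support_changes`, the fixed window size, and the plain difference used as the change measure are all hypothetical and are not the three-stage strategy or the metrics proposed in the paper.

```python
from typing import Iterable, List, Sequence, Set


def support(itemset: Iterable[str], window: Sequence[Set[str]]) -> float:
    """Fraction of transactions in the window that contain every item of the item-set."""
    if not window:
        return 0.0
    target = frozenset(itemset)
    hits = sum(1 for transaction in window if target <= transaction)
    return hits / len(window)


def support_changes(
    itemset: Iterable[str], stream: Sequence[Set[str]], window_size: int
) -> List[float]:
    """Support difference between each pair of consecutive, non-overlapping windows.

    Assumption: a plain difference is used as the change measure; this is an
    illustrative choice, not a metric taken from the paper.
    """
    target = frozenset(itemset)
    # Cut the stream into full windows; a trailing partial window is ignored.
    windows = [
        stream[start:start + window_size]
        for start in range(0, len(stream) - window_size + 1, window_size)
    ]
    supports = [support(target, window) for window in windows]
    return [later - earlier for earlier, later in zip(supports, supports[1:])]


# Hypothetical stream of eight transactions, split into two windows of four.
stream = [
    {"a", "b"}, {"a"}, {"a", "b"}, {"b"},
    {"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"},
]
changes = support_changes({"a", "b"}, stream, window_size=4)
```

Here the item-set {a, b} appears in two of the first four transactions and three of the next four, so its support rises between the two windows. A real stream monitor would keep approximate counts rather than storing whole windows, since the data can be read only once.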
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bhatnagar, V., Kochhar, S.K. (2005). User Subjectivity in Change Modeling of Streaming Itemsets. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_96
DOI: https://doi.org/10.1007/11527503_96
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science, Computer Science (R0)