research-article

Interactive stream mining of maximal frequent itemsets allowing flexible time intervals and support thresholds

Authors:
Ming-Yen Lin, Feng Chia University, Taichung, Taiwan
Sue-Chen Hsueh, Chaoyang University of Technology, Taichung, Taiwan
Chien-Hsiang Tung, Feng Chia University, Taichung, Taiwan

ICUIMC '10: Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication, January 2010, Article No.: 35, Pages 1–8, https://doi.org/10.1145/2108616.2108659

Published: 14 January 2010

ABSTRACT

Stream data mining extracts useful patterns or knowledge from the continuous, rapidly arriving data elements of modern applications. The discovery of frequent patterns in data streams is generally constrained by bounded memory and computation time. Most algorithms for mining frequent itemsets in streaming transactions assume a fixed minimum support threshold and an unchangeable time interval. The support threshold, however, should be changeable to suit the needs of users and the characteristics of the incoming data. In addition, letting users specify the time period of interest may enhance the discovered knowledge. Still, the number of frequent itemsets may be too large for discovering trends or changes. Thus, maximal frequent itemsets (MFIs) with respect to a changeable support in a user-specified period become a favorable objective in stream data mining. In this paper, we propose an algorithm named VIMFI for mining MFIs in a data stream, allowing an arbitrary time interval and support threshold. A bounded memory space is allocated for summarizing all the transactions. VIMFI appends transactions to the summary structure and compresses the structure when it becomes full. At query time, the transactions falling in the specified interval are extracted and mined for the desired MFIs within that interval. Experiments using both synthetic and real-world datasets demonstrate that VIMFI efficiently mines MFIs in data streams with flexible time intervals and changeable support thresholds.
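
To make the query model concrete, the following is a minimal Python sketch of what "MFIs for a user-specified interval and support threshold" means. It is not VIMFI: the abstract does not describe the summary structure or its compression, so this sketch simply keeps timestamped transactions in a list and enumerates candidates by brute force. The function name, the `(timestamp, items)` layout, and the use of a relative support in (0, 1] are all assumptions made for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(stream, start, end, min_support):
    """Brute-force MFIs over transactions with start <= timestamp <= end.

    stream      -- iterable of (timestamp, items) pairs (hypothetical layout)
    min_support -- fraction in (0, 1] of the transactions in the interval
    """
    # Extract the transactions that fall in the requested interval.
    window = [frozenset(items) for ts, items in stream if start <= ts <= end]
    if not window:
        return []
    min_count = min_support * len(window)
    universe = sorted(set().union(*window))

    frequent = []
    for size in range(1, len(universe) + 1):
        level = [
            frozenset(candidate)
            for candidate in combinations(universe, size)
            if sum(1 for t in window if t.issuperset(candidate)) >= min_count
        ]
        if not level:
            break  # anti-monotonicity: no larger itemset can be frequent
        frequent.extend(level)

    # An itemset is maximal if no frequent proper superset exists.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy stream: timestamps 1..5, items as single letters.
stream = [
    (1, {"a", "b", "c"}),
    (2, {"a", "b"}),
    (3, {"a", "c"}),
    (4, {"b", "d"}),
    (5, {"a", "b", "c"}),
]
whole = maximal_frequent_itemsets(stream, 1, 5, 0.6)
middle = maximal_frequent_itemsets(stream, 2, 4, 0.6)
relaxed = maximal_frequent_itemsets(stream, 1, 5, 0.4)
```

On this toy stream, the same data yields different MFIs as the interval or the threshold changes, which is the interactive behaviour the abstract motivates. The enumeration is exponential in the number of distinct items, so it only serves to pin down the definition on small inputs.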

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published in
ICUIMC '10: Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication
January 2010
550 pages
ISBN: 9781605588933
DOI: 10.1145/2108616
Conference Chairs: Kwan-Ho You (Sungkyunkwan University, Korea), Sang-Won Lee (Sungkyunkwan University, Korea)
General Chairs: Won Kim (Sungkyunkwan University, Korea), Dongho Won (Sungkyunkwan University, Korea)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data stream
interactive mining
maximal frequent itemsets
variable interval
variable support
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 251 of 941 submissions, 27%