Abstract
Sequential Pattern Mining (SPM) is a Data Mining problem that applies to temporal and time series data. This paper concerns SPM algorithms that work on stream data. We present three new stream SPM methods, called SS-BE2, SS-LC and SS-LC2, which are extensions of SS-BE. Like SS-BE, the proposed methods process fixed-size batches using the PrefixSpan algorithm; the critical problems at each step are how to store the huge number of candidate patterns and how to select the frequent patterns properly. The main idea is to improve the tree pruning method of the original SS-BE so as to guarantee high completeness and correctness of the result. In all experiments performed on benchmark data, the proposed solutions outperform the original SS-BE algorithm. Moreover, the proposed algorithms appear to be scalable, as memory usage depends linearly on the number of patterns and on the size of the buffer.
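The batch-oriented scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the SS-BE, SS-BE2, SS-LC or SS-LC2 algorithm: the abstract does not give their pruning rules, so the rule below (drop a pattern when its accumulated count falls under `prune_margin` times the number of batches seen) is a simple stand-in chosen for illustration, and a flat dictionary replaces the pattern tree. The names `prefixspan`, `mine_stream`, `batch_min_count`, `prune_every` and `prune_margin` are all hypothetical, and sequences are assumed to consist of single items.

```python
from collections import defaultdict


def prefixspan(sequences, min_count):
    """Simplified PrefixSpan for sequences of single items.

    Returns a dict mapping each frequent pattern (a tuple of items)
    to the number of sequences in which it occurs as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after matching `prefix` as early as possible.
        counts = defaultdict(int)
        for sid, pos in projected:
            for item in set(sequences[sid][pos:]):
                counts[item] += 1
        for item, count in counts.items():
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            new_projected = []
            for sid, pos in projected:
                try:
                    idx = sequences[sid].index(item, pos)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((sid, idx + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


def mine_stream(batches, batch_min_count, prune_every, prune_margin):
    """Accumulate per-batch pattern counts and prune periodically.

    ASSUMPTION: the pruning rule is illustrative only and is not the
    rule used by SS-BE or by the methods proposed in the paper.
    """
    store = {}  # pattern -> accumulated count over all batches seen
    for batch_no, batch in enumerate(batches, start=1):
        for pattern, count in prefixspan(batch, batch_min_count).items():
            store[pattern] = store.get(pattern, 0) + count
        if batch_no % prune_every == 0:
            threshold = prune_margin * batch_no
            store = {p: c for p, c in store.items() if c >= threshold}
    return store


if __name__ == "__main__":
    batches = [
        [("a", "b", "c"), ("a", "c"), ("b", "c")],
        [("a", "c"), ("a", "b"), ("c",)],
    ]
    print(mine_stream(batches, batch_min_count=2, prune_every=2, prune_margin=2))
```

In this sketch memory use is governed by the number of stored patterns and the batch size, which is the trade-off the abstract refers to; a stricter `prune_margin` keeps the store smaller at the risk of discarding patterns that would later become frequent.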
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koper, A., Nguyen, H.S. (2011). Sequential Pattern Mining from Stream Data. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25856-5_21
DOI: https://doi.org/10.1007/978-3-642-25856-5_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25855-8
Online ISBN: 978-3-642-25856-5
eBook Packages: Computer Science (R0)