
Mining Frequent Closed Sequential Patterns with Non-user-defined Gap Constraints

  • Conference paper
Advanced Data Mining and Applications (ADMA 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8933)

Abstract

Frequent closed sequential pattern mining plays an important role in sequence data mining and has a wide range of real-world applications, such as protein sequence analysis, financial data investigation, and user behavior prediction. Previous studies treat the gap constraint as a user-predefined parameter of frequent closed sequential pattern mining. However, users who lack sufficient prior knowledge find it difficult to set suitable gap constraints. Furthermore, different gap constraints may lead to different results, and some useful patterns may be missed if the gap constraint is chosen inappropriately. To address this, we introduce the novel problem of mining frequent closed sequential patterns with non-user-defined gap constraints, and we propose an efficient algorithm to find the frequent closed sequential patterns with the most suitable gap constraints. Our empirical study on protein data sets demonstrates that our algorithm is effective and efficient.
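The sensitivity to the gap constraint noted in the abstract can be illustrated with a small sketch. It is not the algorithm proposed in the paper. It only shows how the support of one pattern changes as the allowed gap changes. The function names `occurs` and `support`, the toy sequences, and the definition of a gap as the number of symbols skipped between two consecutive matched positions are all assumptions made for illustration.

```python
def occurs(pattern, seq, min_gap, max_gap):
    """Return True if `pattern` occurs in `seq` with every gap in [min_gap, max_gap].

    Assumption: the gap between matched positions i < j is j - i - 1,
    i.e. the number of symbols skipped between them.
    """
    if not pattern:
        return True
    # Positions in seq where the matched prefix of the pattern can end.
    ends = {i for i, ch in enumerate(seq) if ch == pattern[0]}
    for symbol in pattern[1:]:
        next_ends = set()
        for i in ends:
            lo = i + 1 + min_gap
            hi = min(i + 1 + max_gap, len(seq) - 1)
            for j in range(lo, hi + 1):
                if seq[j] == symbol:
                    next_ends.add(j)
        ends = next_ends
        if not ends:
            return False
    return bool(ends)


def support(pattern, sequences, min_gap, max_gap):
    """Fraction of sequences that contain `pattern` under the gap constraint."""
    hits = sum(occurs(pattern, s, min_gap, max_gap) for s in sequences)
    return hits / len(sequences)


# Hypothetical toy data, not taken from the paper.
toy_db = ["ACGT", "AGCT", "AXXC", "CA"]
for max_gap in (0, 1, 2):
    print(max_gap, support("AC", toy_db, 0, max_gap))
```

On this toy data the support of `AC` grows as the maximum gap is relaxed, so a pattern that is frequent under one constraint can fall below a support threshold under another.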

This work was supported in part by NSFC 61103042, SRFDP 20100181120029, SKLSE2012-09-32, and China Postdoctoral Science Foundation 2014M552371.




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, W. et al. (2014). Mining Frequent Closed Sequential Patterns with Non-user-defined Gap Constraints. In: Luo, X., Yu, J.X., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2014. Lecture Notes in Computer Science, vol 8933. Springer, Cham. https://doi.org/10.1007/978-3-319-14717-8_5

  • DOI: https://doi.org/10.1007/978-3-319-14717-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14716-1

  • Online ISBN: 978-3-319-14717-8

  • eBook Packages: Computer Science, Computer Science (R0)
