Skip to main content

Minits-AllOcc: An Efficient Algorithm for Mining Timed Sequential Patterns

  • Conference paper
  • First Online:
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 12712))

Included in the following conference series:

  • 3583 Accesses

Abstract

Sequential pattern mining aims to find the subsequences in a sequence database that appear together in the order of timestamps. Although there exist sequential pattern mining techniques, they ignore the temporal relationship information between the itemsets in the subsequences. This information is important in many real-world applications. For example, even if healthcare providers know that symptom Y frequently occurs after symptom X, it is also valuable for them to be able to estimate when Y will occur after X so that they can provide treatment at the right time. Considering temporal relationship information for sequential pattern mining raises new issues to be solved, such as designing a new data structure to save this information and traversing this structure efficiently to discover patterns without re-scanning the database. In this paper, we propose an algorithm called Minits-AllOcc (MINIng Timed Sequential Pattern for All-time Occurrences) to find sequential patterns and the transition time between itemsets based on all possible occurrences of a pattern in the database. We also propose a parallel multicore CPU version of this algorithm, called MMinits-AllOcc (Multicore Minits-AllOcc), to deal with Big Data. Extensive experiments on real and synthetic datasets show the advantages of this approach over the brute-force method. Also, the multicore CPU version of the algorithm is shown to outperform the single-core version on Big Data by 2.5X.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Agrawal, R., Srikant, R.: Mining sequential patterns. In: Proceedings of the 11th IEEE International Conference, Taiwan, pp. 3–14. IEEE (1995)

    Google Scholar 

  2. AlZahrani, M.Y., Mazarbhuiya, F.A. Discovering constraint-based sequential patterns from medical datasets. Int. J. Recent Tech. Eng. (IJRTE) (2019). ISSN 2277-3878

    Google Scholar 

  3. Chen, Y.L., Chiang, M.C., Ko, M.T.: Discovering time-interval sequential patterns in sequence databases. Expert Syst. Appl. 25(3), 343–354 (2003)

    Article  Google Scholar 

  4. Fournier-Viger, P., et al.: The SPMF open-source data mining library version 2. In: Berendt, B., et al. (eds.) ECML PKDD 2016. LNCS (LNAI), vol. 9853, pp. 36–40. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46131-1_8

  5. Fournier-Viger, P., Lin, J.C.W., Kiran, R.U., Koh, Y.S., Thomas, R.: A survey of sequential pattern mining. Data Sci. Pattern Recognit. 1(1), 54–77 (2017)

    Google Scholar 

  6. Gan, W., Lin, J.C.W., Fournier-Viger, P., Chao, H.C., Yu, P.S.: A survey of parallel sequential pattern mining. ACM Trans. Knowl. Discov. Data (TKDD) 13(3), 1–34 (2019)

    Article  Google Scholar 

  7. Giannotti, F., Nanni, M., Pedreschi, D.: Efficient mining of temporally annotated sequences. In: Proceedings of the 2006 SIAM International Conference on Data Mining, pp. 348–359 (2006)

    Google Scholar 

  8. Giannotti, F., Nanni, M., Pinelli, F., Pedreschi, D.: Trajectory pattern mining. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 330–339 (2007)

    Google Scholar 

  9. Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., Hsu, M.-C.: FreeSpan: frequent pattern-projected sequential pattern mining. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 355–359 (2000)

    Google Scholar 

  10. Hu, Y.H., Huang, T.C.K., Yang, H.R., Chen, Y.L.: On mining multi-time-interval sequential patterns. Data Knowl. Eng. 68(10), 1112–1127 (2009)

    Article  Google Scholar 

  11. Huynh, B., Vo, B., Snasel, V.: An efficient method for mining frequent sequential patterns using multi-core processors. Appl. Intell. 46(3), 703–716 (2017)

    Article  Google Scholar 

  12. Jay, N., Herengt, G., Albuisson, E., Kohler, F., Napoli, A.: Sequential pattern mining and classification of patient path. Medinfo 1667 (2004)

    Google Scholar 

  13. Yuan, J., et al.: T-drive: driving directions based on taxi trajectories. In: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS 2010, pp. 99–108. ACM, New York (2010)

    Google Scholar 

  14. Yuan, J., Zheng, Y., Xie, X., Sun, G.: Driving with knowledge from the physical world. In: The 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2011. ACM, New York (2011)

    Google Scholar 

  15. Karsoum, S., Gruenwald, L., Barrus, C., Leal, E.: Using timed sequential patterns in the transportation industry. In: 2019 IEEE International Conference on Big Data (Big Data), pp. 3573–3582. IEEE, December 2019

    Google Scholar 

  16. Li, H., Zhou, X., Pan, C.: Study on GSP algorithm based on Hadoop. In: 5th IEEE International Conference Electronics Information and Emergency Communication (ICEIEC), pp. 321–324 (2015)

    Google Scholar 

  17. Pei, J., et al.: PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: ICCCN, p. 0215 (2001)

    Google Scholar 

  18. Pramono, Y.W.T.: Anomaly-based intrusion detection and prevention system on website usage using rule-growth sequential pattern analysis: case study: statistics of Indonesia (BPS) website. In: 2014 International Conference of Advanced Informatics: Concept, Theory, and Application (ICAICTA), pp. 203–208. IEEE, August 2014

    Google Scholar 

  19. Srikant, R., Agrawal, R.: Mining sequential patterns: generalizations and performance improvements. In: Apers, P., Bouzeghoub, M., Gardarin, G. (eds.) EDBT 1996. LNCS, vol 1057, pp. 1–17. Springer, Heidelberg (1996). https://doi.org/10.1007/BFb0014140

    Chapter  Google Scholar 

  20. Wei, Y.-Q., Liu, D., Duan, L.-S.: Distributed prefixspan algorithm based on mapreduce. In: International Symposium on IEEE Information Technology in Medicine and Education (ITME), vol. 2, pp. 901–904 (2012)

    Google Scholar 

  21. Yang, H., Gruenwald, L., Boulanger, M.: A novel real-time framework for extracting patterns from trajectory data streams. In: Proceedings of the 4th ACM SIGSPATIAL International Workshop on GeoStreaming, pp. 26–32 (2013)

    Google Scholar 

  22. Zaki, M.: SPADE: an efficient algorithm for mining frequent sequences. Mach. Learn. 42(1–2), 31–60 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Somayah Karsoum .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Karsoum, S., Barrus, C., Gruenwald, L., Leal, E. (2021). Minits-AllOcc: An Efficient Algorithm for Mining Timed Sequential Patterns. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science(), vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_53

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-75762-5_53

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75761-8

  • Online ISBN: 978-3-030-75762-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics