Toward an Ontology of Pattern Mining over Data Streams

  • Conference paper
Intelligent Systems and Pattern Recognition (ISPR 2023)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1940))

Abstract

Pattern Mining over Data Streams (PMDS) is one of the most significant tasks in data mining. A major challenge is to define a representational framework that uniformly represents PMDS algorithms that deal with different pattern types (frequent itemset, high-utility itemset, uncertain frequent itemset), use different methods (test-and-generate, pattern-growth, hybrid), and rely on different window models (landmark, sliding, decay, tilted). Such a framework would help standardize the process, improve the understanding of algorithm design, and provide a basis for unification and further research. It would also facilitate variability management and allow the derivation of tools for broad experimentation. In this paper, we propose a reference ontology that formalizes the domain knowledge around PMDS. The design process followed leading practices in ontology engineering. The ontology is aligned with the most popular data mining and machine learning ontologies and thus represents a major contribution toward PMDS domain ontologies.
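
As a rough illustration of the three dimensions named in the abstract, the following Python sketch models them as enumerations and describes an algorithm as one choice along each dimension. It is not the ontology proposed in the paper: the class names, the `AlgorithmDescriptor` record, the `select` helper, and the catalog entries are all hypothetical and serve only to show how a uniform description could support filtering algorithms by their characteristics.

```python
from dataclasses import dataclass
from enum import Enum


class PatternType(Enum):
    FREQUENT_ITEMSET = "frequent itemset"
    HIGH_UTILITY_ITEMSET = "high-utility itemset"
    UNCERTAIN_FREQUENT_ITEMSET = "uncertain frequent itemset"


class MiningMethod(Enum):
    TEST_AND_GENERATE = "test-and-generate"
    PATTERN_GROWTH = "pattern-growth"
    HYBRID = "hybrid"


class WindowModel(Enum):
    LANDMARK = "landmark"
    SLIDING = "sliding"
    DECAY = "decay"
    TILTED = "tilted"


@dataclass(frozen=True)
class AlgorithmDescriptor:
    """Hypothetical record: one value per dimension listed in the abstract."""
    name: str
    pattern_type: PatternType
    method: MiningMethod
    window: WindowModel


def select(catalog, **criteria):
    """Return the descriptors whose fields equal every given criterion."""
    return [
        algo for algo in catalog
        if all(getattr(algo, field) == value for field, value in criteria.items())
    ]


# Placeholder entries (assumption): they do not describe real algorithms.
catalog = [
    AlgorithmDescriptor("example_a", PatternType.FREQUENT_ITEMSET,
                        MiningMethod.PATTERN_GROWTH, WindowModel.SLIDING),
    AlgorithmDescriptor("example_b", PatternType.HIGH_UTILITY_ITEMSET,
                        MiningMethod.TEST_AND_GENERATE, WindowModel.LANDMARK),
    AlgorithmDescriptor("example_c", PatternType.FREQUENT_ITEMSET,
                        MiningMethod.HYBRID, WindowModel.SLIDING),
]

sliding_names = [a.name for a in select(catalog, window=WindowModel.SLIDING)]
```

A formal ontology would additionally capture relations, constraints, and alignment with upper-level ontologies, none of which this flat record attempts to express.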

Notes

  1. https://protege.stanford.edu/

  2. https://neo4j.com/download/

  3. https://neo4j.com/labs/neosemantics/

Author information

Corresponding author

Correspondence to Dame Samb.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Samb, D., Slimani, Y., Ndiaye, S. (2024). Toward an Ontology of Pattern Mining over Data Streams. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1940. Springer, Cham. https://doi.org/10.1007/978-3-031-46335-8_12

  • DOI: https://doi.org/10.1007/978-3-031-46335-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46334-1

  • Online ISBN: 978-3-031-46335-8

  • eBook Packages: Computer Science, Computer Science (R0)
