Toward an Ontology of Pattern Mining over Data Streams

Samb, Dame; Slimani, Yahya; Ndiaye, Samba

doi:10.1007/978-3-031-46335-8_12


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1940)

Included in the following conference series:

International Conference on Intelligent Systems and Pattern Recognition

Abstract

Pattern Mining over Data Streams (PMDS) is one of the most significant tasks in data mining. A major challenge is to define a representational framework that uniformly covers PMDS algorithms dealing with different pattern types (frequent itemset, high-utility itemset, uncertain frequent itemset), using different methods (test-and-generate, pattern-growth, hybrid) and different window models (landmark, sliding, decay, tilted). Such a framework would help standardize the process, improve understanding of algorithm design, and provide a basis for unification and new research opportunities. It would also facilitate variability management and allow the derivation of tools for wide experimentation. In this paper, we propose a reference ontology to formalize the domain knowledge around PMDS. The design process of the ontology followed leading practices in ontology engineering. The ontology is aligned with the most popular data mining and machine learning ontologies and thus represents a major contribution toward PMDS domain ontologies.
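As a rough illustration of the three axes of variability named in the abstract (pattern type, method, window model), the sketch below encodes them as Python enumerations and describes an algorithm as one point in that space. This is not the authors' ontology, which is not shown on this page; the class names, the `select` helper, and the catalog entries `ExampleA` and `ExampleB` are hypothetical and exist only to show how a uniform description could support querying.

```python
from dataclasses import dataclass
from enum import Enum


# The three axes below copy the category names listed in the abstract.
class PatternType(Enum):
    FREQUENT_ITEMSET = "frequent itemset"
    HIGH_UTILITY_ITEMSET = "high-utility itemset"
    UNCERTAIN_FREQUENT_ITEMSET = "uncertain frequent itemset"


class MiningMethod(Enum):
    TEST_AND_GENERATE = "test-and-generate"
    PATTERN_GROWTH = "pattern-growth"
    HYBRID = "hybrid"


class WindowModel(Enum):
    LANDMARK = "landmark"
    SLIDING = "sliding"
    DECAY = "decay"
    TILTED = "tilted"


@dataclass(frozen=True)
class AlgorithmDescription:
    """Hypothetical uniform description of one PMDS algorithm."""
    name: str
    pattern_type: PatternType
    method: MiningMethod
    window: WindowModel


def select(algorithms, **criteria):
    """Return the descriptions whose fields equal every given criterion."""
    return [
        a for a in algorithms
        if all(getattr(a, field) == value for field, value in criteria.items())
    ]


# Placeholder entries; they do not correspond to any published algorithm.
catalog = [
    AlgorithmDescription("ExampleA", PatternType.FREQUENT_ITEMSET,
                         MiningMethod.PATTERN_GROWTH, WindowModel.SLIDING),
    AlgorithmDescription("ExampleB", PatternType.HIGH_UTILITY_ITEMSET,
                         MiningMethod.HYBRID, WindowModel.LANDMARK),
]

sliding = select(catalog, window=WindowModel.SLIDING)
```

Describing every algorithm with the same three fields is what makes the catalog queryable along any axis; a real ontology would add relations and alignment to existing vocabularies that a flat record like this cannot express.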

Author information

Authors and Affiliations

Management Department, UIDT, Cite Malick SY, N2, BP: A967, Thies, Senegal
Dame Samb
ISAMM, Tunis, Tunisia
Yahya Slimani
Mathematics and Computer Science Department, Cheikh Anta Diop University, Dakar, Senegal
Samba Ndiaye

Corresponding author

Correspondence to Dame Samb.

Editor information

Editors and Affiliations

Larbi Tebessi University, Tebessa, Algeria
Akram Bennour
Sharjah University, Sharjah, United Arab Emirates
Ahmed Bouridane
University of Toulouse, Toulouse, France
Lotfi Chaari

About this paper

Cite this paper

Samb, D., Slimani, Y., Ndiaye, S. (2024). Toward an Ontology of Pattern Mining over Data Streams. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1940. Springer, Cham. https://doi.org/10.1007/978-3-031-46335-8_12

DOI: https://doi.org/10.1007/978-3-031-46335-8_12
Published: 05 November 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46334-1
Online ISBN: 978-3-031-46335-8
eBook Packages: Computer Science, Computer Science (R0)
