Abstract
Ontology matching aims at identifying the correspondences between instances and data properties of different ontologies. This article reviews the use of data mining approaches for the ontology matching problem. We propose DMOM (Data Mining for Ontology Matching), an instance-based framework that selects the data properties of instances efficiently. The framework exploits data mining techniques to select the most appropriate features for matching ontologies. Three strategies have been investigated to select the relevant features for the matching process. The first, called exhaustive, randomly explores the enumeration search tree by generating a subset of feature attributes at each iteration; each node is evaluated by running the matching process on its selected attributes. The second, called statistical, uses statistical values to select the most relevant properties. The third, called FIM (Frequent Itemsets Mining), explores the correlations between properties and selects the most frequent properties describing the instances of the given ontology. To demonstrate the usefulness of the DMOM framework, several experiments have been carried out on the OAEI (Ontology Alignment Evaluation Initiative) and DBpedia ontology databases. The results show that the third strategy, FIM, outperforms the other two strategies (exhaustive and statistical). The results also reveal that DMOM outperforms state-of-the-art ontology matching approaches in terms of both execution time and the quality of the matching process.
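As an illustration of the idea behind the FIM strategy, the following minimal sketch treats each instance as a transaction whose items are its data properties, mines the frequent itemsets, and keeps the properties that occur in them. It is a sketch under stated assumptions, not the DMOM implementation: the brute-force enumeration, the function names, the `min_support` and `max_size` parameters, and the property names are all hypothetical and chosen only for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force enumeration (an assumption for brevity); a real system
    would use a dedicated miner such as Apriori or FP-Growth.
    """
    n = len(transactions)
    counts = Counter()
    for props in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(props), size):
                counts[frozenset(itemset)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def select_properties(transactions, min_support, max_size=2):
    """Keep every property that appears in at least one frequent itemset."""
    selected = set()
    for itemset in frequent_itemsets(transactions, min_support, max_size):
        selected |= itemset
    return selected


# Hypothetical instances: each one is the set of data properties it carries.
instances = [
    {"name", "birthDate", "birthPlace"},
    {"name", "birthDate"},
    {"name", "birthPlace", "height"},
    {"name", "birthDate", "birthPlace"},
]

selected = select_properties(instances, min_support=0.75)
print(sorted(selected))
```

With this toy data, `height` appears in only one of four instances and is dropped, so a matcher would compare instances on the remaining three properties only. Restricting the comparison to frequently co-occurring properties is what reduces the matching cost.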
Cite this article
Belhadi, H., Akli-Astouati, K., Djenouri, Y. et al. Data mining-based approach for ontology matching problem. Appl Intell 50, 1204–1221 (2020). https://doi.org/10.1007/s10489-019-01593-3
DOI: https://doi.org/10.1007/s10489-019-01593-3