Data mining-based approach for ontology matching problem

Applied Intelligence

Abstract

Ontology matching aims to identify correspondences between the instances and data properties of different ontologies. This article reviews the use of data mining approaches for the ontology matching problem. We propose the DMOM (Data Mining for Ontology Matching based instances) framework to select the data properties of instances efficiently. The framework exploits data mining techniques to select the most appropriate features for matching ontologies. Three strategies are investigated for selecting the features relevant to the matching process. The first, called exhaustive, randomly explores the enumeration search tree by generating a subset of feature attributes at each iteration; each node is evaluated by running the matching process on its selected attributes. The second, called statistical, uses statistical values to select the most relevant properties. The third, called FIM (Frequent Itemsets Mining), explores the correlations between properties and selects the most frequent properties describing the instances of the given ontology. To demonstrate the usefulness of the DMOM framework, several experiments were carried out on the OAEI (Ontology Alignment Evaluation Initiative) and DBpedia ontology databases. The results show that the third strategy, FIM, outperforms the other two (exhaustive and statistical). They also show that DMOM outperforms state-of-the-art ontology matching approaches in terms of both execution time and the quality of the matching process.
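The abstract does not give the details of the FIM strategy, so the following is only a minimal sketch of the general idea: treat each instance as a transaction whose items are its data properties, mine the frequent itemsets, and keep the properties that co-occur frequently. The function names, the `min_support` threshold, the level-wise (Apriori-style) search, and the rule of keeping the largest frequent itemsets are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Hypothetical helper: a simple level-wise (Apriori-style) search, used
    here only to illustrate frequent itemset mining over data properties.
    """
    n = len(transactions)
    if n == 0:
        return {}
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    candidates = [frozenset([item]) for item in items]
    k = 1
    while candidates:
        level = {}
        for cand in candidates:
            # Support = fraction of instances that carry every property in cand.
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Build (k+1)-candidates only from items that survived this level.
        k += 1
        survivors = sorted({item for itemset in level for item in itemset})
        candidates = [frozenset(c) for c in combinations(survivors, k)]
    return frequent


def select_properties(instances, min_support):
    """Pick the properties in the largest frequent itemsets.

    Assumption: 'most frequent properties describing the instances' is read
    here as the union of the maximum-size frequent itemsets.
    """
    transactions = [frozenset(props) for props in instances.values()]
    itemsets = frequent_itemsets(transactions, min_support)
    if not itemsets:
        return set()
    largest = max(len(s) for s in itemsets)
    selected = set()
    for itemset in itemsets:
        if len(itemset) == largest:
            selected |= itemset
    return selected


# Toy data (invented for this example): instance id -> set of data properties.
example_instances = {
    "i1": {"name", "birthDate", "country"},
    "i2": {"name", "birthDate"},
    "i3": {"name", "birthDate", "height"},
    "i4": {"name", "country"},
}

print(sorted(select_properties(example_instances, min_support=0.75)))
# → ['birthDate', 'name']
```

In this toy run, `name` and `birthDate` appear together in three of the four instances, so they form the only frequent pair at a support of 0.75. A matcher would then compare instances on those two properties only, which is the kind of reduction in compared features that the abstract credits for the lower execution time.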

Notes

  1. http://wiki.dbpedia.org/Datasets

  2. http://oaei.ontologymatching.org

Author information

Corresponding author

Correspondence to Hiba Belhadi.

About this article

Cite this article

Belhadi, H., Akli-Astouati, K., Djenouri, Y. et al. Data mining-based approach for ontology matching problem. Appl Intell 50, 1204–1221 (2020). https://doi.org/10.1007/s10489-019-01593-3

  • DOI: https://doi.org/10.1007/s10489-019-01593-3
