Abstract
The purely manual specification of semantic correspondences between schemas is almost infeasible for very large schemas or when many different schemas have to be matched. Hence, solving such large-scale match tasks calls for automatic or semiautomatic schema matching approaches. Large-scale matching especially needs to be supported for XML schemas and different kinds of ontologies due to their increasing use and size, e.g., in e-business, web, and life science applications. Unfortunately, correctly and efficiently matching large schemas and ontologies is very challenging, and most previous match systems have only addressed small match tasks. We provide an overview of recently proposed approaches to achieve high match quality and/or high efficiency for large-scale matching. In addition to describing some recent matchers utilizing instance and usage data, we cover approaches to early pruning of the search space, divide-and-conquer strategies, parallel matching, tuning matcher combinations, the reuse of previous match results, and holistic schema matching. We also provide a brief comparison of selected match tools.
Notes
1. F-Measure combines Recall and Precision, two standard measures to evaluate the effectiveness of schema matching approaches (Do et al. 2003).
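As an illustration of the note above, the following minimal sketch computes Precision, Recall, and F-Measure for a match result against a reference mapping. It assumes the commonly used balanced form, F = 2 · Precision · Recall / (Precision + Recall); the note does not state which weighting the chapter uses. The function name `match_quality` and the example correspondences are hypothetical.

```python
def match_quality(found, reference):
    """Return (precision, recall, f_measure) for a set of correspondences.

    Assumption: F-Measure is the harmonic mean of precision and recall.
    """
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)

    # Precision: share of proposed correspondences that are correct.
    precision = true_positives / len(found) if found else 0.0
    # Recall: share of real correspondences that were found.
    recall = true_positives / len(reference) if reference else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences between two purchase-order schemas.
reference = {
    ("PO.custName", "Order.customerName"),
    ("PO.shipTo", "Order.deliveryAddress"),
    ("PO.total", "Order.amount"),
    ("PO.date", "Order.orderDate"),
}
found = {
    ("PO.custName", "Order.customerName"),
    ("PO.total", "Order.amount"),
    ("PO.id", "Order.amount"),  # a false match
}

precision, recall, f_measure = match_quality(found, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f={f_measure:.3f}")
```

Here two of the three proposed correspondences are correct and two of the four real ones are found, so Precision is 2/3, Recall is 1/2, and the balanced F-Measure is 4/7.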
References
Alexe B, Gubanov M, Hernandez MA, Ho H, Huang JW, Katsis Y, Popa L, Saha B, Stanoi I (2009) Simplifying information integration: Object-based flow-of-mappings framework for integration. In: Proceedings of BIRTE08 (business intelligence for the real-time enterprise) workshop. Lecture Notes in Business Information Processing, vol 27. Springer, Heidelberg, pp 108–121
Algergawy A, Schallehn E, Saake G (2009) Improving XML schema matching performance using Prüfer sequences. Data Knowl Eng 68(8):728–747
Algergawy A et al (2010) Combining schema and level-based matching for web service discovery. In: Proceedings of 10th international conference on web engineering (ICWE). Lecture Notes in Computer Science, vol 6189. Springer, Heidelberg, pp 114–128
Aumueller D, Do HH, Massmann S, Rahm E (2005) Schema and ontology matching with COMA++. In: Proceedings of ACM SIGMOD conference, demo paper. ACM, NY, pp 906–908
Avesani P, Giunchiglia F, Yatskevich M (2005) A large scale taxonomy mapping evaluation. In: Proceedings of international conference on semantic web (ICSW). LNCS, vol 3729. Springer, Heidelberg, pp 67–81
Bellahsene Z, Duchateau F (2011) Tuning for schema matching. In: Bellahsene Z, Bonifati A, Rahm E (eds) Schema matching and mapping, Data-Centric Systems and Applications Series. Springer, Heidelberg
Bellahsene Z, Bonifati A, Duchateau F, Velegrakis Y (2011) On evaluating schema matching and mapping. In: Bellahsene Z, Bonifati A, Rahm E (eds) Schema matching and mapping, Data-Centric Systems and Applications Series. Springer, Heidelberg
Bernstein PA, Melnik S, Petropoulos M, Quix C (2004) Industrial-strength schema matching. ACM SIGMOD Rec 33(4):38–43
Bernstein PA, Melnik S, Churchill JE (2006) Incremental schema matching. In: Proceedings of VLDB, demo paper. VLDB Endowment, pp 1167–1170
Chen K, Madhavan J, Halevy AY (2009) Exploring schema repositories with Schemr. In: Proceedings of ACM SIGMOD Conference, demo paper. ACM, NY, pp 1095–1098
Cruz IF, Antonelli FP, Stroe C (2009) AgreementMaker: Efficient matching for large real-world schemas and ontologies. In: PVLDB, vol 2(2), demo paper. VLDB Endowment, pp 1586–1589
Das Sarma A, Dong X, Halevy AY (2008) Bootstrapping pay-as-you-go data integration systems. In: Proceedings of ACM SIGMOD conference. ACM, NY, pp 861–874
Das Sarma A, Dong X, Halevy AY (2011) Uncertainty in data integration and dataspace support platforms. In: Bellahsene Z, Bonifati A, Rahm E (eds) Schema matching and mapping, Data-Centric Systems and Applications Series. Springer, Heidelberg
Do HH (2006) Schema Matching and Mapping-based Data Integration. Dissertation, Dept of Computer Science, Univ. of Leipzig
Do HH, Rahm E (2002) COMA – A system for flexible combination of schema matching approaches. In: Proceedings of VLDB conference, pp 610–621
Do HH, Rahm E (2007) Matching large schemas: Approaches and evaluation. Inf Syst 32(6):857–885
Do HH, Melnik S, Rahm E (2003) Comparison of schema matching evaluations. In: Web, web-services, and database systems. LNCS, vol 2593. Springer, Heidelberg
Doan A, Madhavan J, Dhamankar R, Domingos P, Halevy AY (2003) Learning to match ontologies on the semantic web. VLDB J 12(4):303–319
Dong X, Halevy AY, Madhavan J, Nemes E, Zhang J (2004) Similarity search for web services. In: Proceedings of VLDB conference. VLDB Endowment, pp 372–383
Duchateau F, Coletta R, Bellahsene Z, Miller RJ (2009) (Not) yet another matcher. In: Proceedings of CIKM, poster paper. ACM, NY, pp 1537–1540
Ehrig M, Staab S (2004) Quick ontology matching. In: Proceedings of international conference semantic web (ICSW). LNCS, vol 3298. Springer, Heidelberg, pp 683–697
Ehrig M, Staab S, Sure Y (2005) Bootstrapping ontology alignment methods with APFEL. In: Proceedings of international conference on semantic web (ICSW). LNCS, vol 3729. Springer, Heidelberg, pp 1148–1149
Elmagarmid AK, Ipeirotis PG, Verykios VS (2007) Duplicate record detection: A survey. IEEE Trans Knowl Data Eng 19(1):1–16
Elmeleegy H, Ouzzani M, Elmagarmid AK (2008) Usage-based schema matching. In: Proceedings of ICDE conference. IEEE Computer Society, Washington, DC, pp 20–29
Euzenat J, Shvaiko P (2007) Ontology matching. Springer, Heidelberg
Euzenat J et al (2009) Results of the ontology alignment evaluation initiative 2009. In: Proceedings of the 4th international workshop on Ontology Matching (OM-2009)
Fagin R, Haas LM, Hernández MA, Miller RJ, Popa L, Velegrakis Y (2009) Clio: Schema mapping creation and data exchange. In: Conceptual modeling: Foundations and applications. LNCS, vol 5600. Springer, Heidelberg
Falconer SM, Noy NF (2011) Interactive techniques to support ontology mapping. In: Bellahsene Z, Bonifati A, Rahm E (eds) Schema matching and mapping. Data-Centric Systems and Applications Series. Springer, Heidelberg
Gligorov R, ten Kate W, Aleksovski Z, van Harmelen F (2007) Using Google distance to weight approximate ontology matches. In: Proceedings of WWW conference, pp 767–776
Gross A, Hartung M, Kirsten T, Rahm E (2010) On matching large life science ontologies in parallel. In: Proceedings of 7th international conference on data integration in the life sciences (DILS). LNCS, vol 6254. Springer, Heidelberg
Gubanov M et al (2009) IBM UFO repository: Object-oriented data integration. PVLDB, demo paper. VLDB Endowment, pp 1598–1601
Hamdi F, Safar B, Reynaud C, Zargayouna H (2009) Alignment-based partitioning of large-scale ontologies. In: Advances in knowledge discovery and management. Studies in Computational Intelligence Series. Springer, Heidelberg
Hanif MS, Aono M (2009) An efficient and scalable algorithm for segmented alignment of ontologies of arbitrary size. J Web Sem 7(4):344–356
He B, Chang KC (2006) Automatic complex schema matching across Web query interfaces: A correlation mining approach. ACM Trans Database Syst 31(1):346–395
He H, Meng W, Yu CT, Wu Z (2004) Automatic integration of Web search interfaces with WISE-Integrator. VLDB J 13(3):256–273
Hu W, Qu Y, Cheng G (2008) Matching large ontologies: A divide-and-conquer-approach. Data Knowl Eng 67(1):140–160
Jean-Mary YR, Shironoshita EP, Kabuka MR (2009) Ontology matching with semantic verification. J Web Sem 7(3):235–251
Kappel G et al (2007) Matching metamodels with semantic systems – An experience report. In: Proceedings of BTW workshop on model management, pp 1–15
Kirsten T, Thor A, Rahm E (2007) Instance-based matching of large life science ontologies. In: Proceedings of data integration in the life sciences (DILS). LNCS, vol 4544. Springer, Heidelberg, pp 172–187
Koepcke H, Rahm E (2010) Frameworks for entity matching: A comparison. Data Knowl Eng 69(2):197–210
Koudas N, Marathe A, Srivastava D (2004) Flexible string matching against large databases in practice. In: Proceedings of VLDB conference. VLDB Endowment, pp 1078–1086
Lambrix P, Tan H, Xu W (2008) Literature-based alignment of ontologies. In: Proceedings of the 3rd International Workshop on Ontology Matching (OM-2008)
Lee Y, Sayyadian M, Doan A, Rosenthal A (2007) eTuner: Tuning schema matching software using synthetic scenarios. VLDB J 16(1):97–122
Li J, Tang J, Li Y, Luo Q (2009) RiMOM: A dynamic multistrategy ontology alignment framework. IEEE Trans Knowl Data Eng 21(8):1218–1232
Madhavan J, Bernstein PA, Rahm E (2001) Generic schema matching with Cupid. In: Proceedings of VLDB conference, pp 49–58
Madhavan J, Bernstein PA, Doan A, Halevy AY (2005) Corpus-based schema matching. In: Proceedings of ICDE conference. IEEE Computer Society, Washington, DC, pp 57–68
Mao M, Peng Y, Spring M (2008) A harmony based adaptive ontology mapping approach. In: Proceedings of international conference on semantic web and web services (SWWS), pp 336–342
Massmann S, Rahm E (2008) Evaluating instance-based matching of web directories. In: Proceedings of 11th international Workshop on the Web and Databases (WebDB 2008)
Mork P, Seligman L, Rosenthal A, Korb J, Wolf C (2008) The harmony integration workbench. J Data Sem 11:65–93
Nandi A, Bernstein PA (2009) HAMSTER: Using search clicklogs for schema and taxonomy matching. PVLDB, vol 2(1), pp 181–192
Peukert E, Berthold H, Rahm E (2010a) Rewrite techniques for performance optimization of schema matching processes. In: Proceedings of 13th international conference on extending database technology (EDBT). ACM, NY, pp 453–464
Peukert E, Massmann S, König K (2010b) Comparing similarity combination methods for schema matching. In: Proceedings of 40th annual conference of the German computer society (GI-Jahrestagung). Lecture Notes in Informatics 175, pp 692–701
Pirrò G, Talia D (2010) UFOme: An ontology mapping system with strategy prediction capabilities. Data Knowl Eng 69(5):444–471
Rahm E, Bernstein PA (2001) A survey of approaches to automatic schema matching. VLDB J 10(4):334–350
Rahm E, Do HH, Massmann S (2004) Matching large XML schemas. SIGMOD Rec 33(4):26–31
Saha B, Stanoi I, Clarkson KL (2010) Schema covering: A step towards enabling reuse in information integration. In: Proceedings of ICDE conference, pp 285–296
Saleem K, Bellahsene Z, Hunt E (2008) PORSCHE: Performance oriented SCHEma mediation. Inf Syst 33(7–8):637–657
SAP (2010) Warp10 community-based integration. https://cw.sdn.sap.com/cw/docs/DOC-120470 (white paper), https://cw.sdn.sap.com/cw/community/esc/cdg135. Accessed April 2010
Seligman L, Mork P, Halevy AY et al (2010) OpenII: An open source information integration toolkit. In: Proceedings of ACM SIGMOD conference. ACM, NY, pp 1057–1060
Shi F, Li J et al (2009) Actively learning ontology matching via user interaction. In: Proceedings of international conference on semantic web (ICSW). Springer, Heidelberg, pp 585–600
Smith K, Morse M, Mork P et al (2009) The role of schema matching in large enterprises. In: Proceedings of CIDR
Spiliopoulos V, Vouros GA, Karkaletsis V (2010) On the discovery of subsumption relations for the alignment of ontologies. J Web Sem 8(1):69–88
Su W, Wang J, Lochovsky FH (2006) Holistic schema matching for web query interfaces. In: Proceedings of international conference on extending database technology (EDBT). Springer, Heidelberg, pp 77–94
Tan H, Lambrix P (2007) A method for recommending ontology alignment strategies. In: Proceedings of international conference on semantic web (ICSW). LNCS, vol 4825. Springer, Heidelberg
Thor A, Kirsten T, Rahm E (2007) Instance-based matching of hierarchical ontologies. In: Proceedings of 12th BTW conference (Database systems for business, technology and web). Lecture Notes in Informatics 103, pp 436–448
Zhang S, Mork P, Bodenreider O, Bernstein PA (2007) Comparing two approaches for aligning representations of anatomy. Artif Intell Med 39(3):227–236
Zhong Q, Li H et al (2009) A gauss function based approach for unbalanced ontology matching. In: Proceedings of ACM SIGMOD conference. ACM, NY, pp 669–680
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Rahm, E. (2011). Towards Large-Scale Schema and Ontology Matching. In: Bellahsene, Z., Bonifati, A., Rahm, E. (eds) Schema Matching and Mapping. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16518-4_1
DOI: https://doi.org/10.1007/978-3-642-16518-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16517-7
Online ISBN: 978-3-642-16518-4
eBook Packages: Computer Science, Computer Science (R0)