A large dataset for the evaluation of ontology matching

Published online by Cambridge University Press:  01 June 2009

Fausto Giunchiglia*
Affiliation:
Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy
Mikalai Yatskevich*
Affiliation:
Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy
Paolo Avesani*
Affiliation:
Fondazione Bruno Kessler, Via Sommarive 18, 38050 Povo, Trento, Italy
Pavel Shvaiko*
Affiliation:
Department of Information Engineering and Computer Science (DISI), University of Trento, 38050 Povo, Trento, Italy

Abstract

Recently, the number of ontology matching techniques and systems has increased significantly, which makes their evaluation and comparison more pressing. One of the challenges of ontology matching evaluation is building large-scale evaluation datasets: the number of possible correspondences between two ontologies grows quadratically with the number of entities in these ontologies. This often makes manual construction of evaluation datasets demanding to the point of being infeasible for large-scale matching tasks. In this paper, we present TaxME2, an ontology matching evaluation dataset composed of thousands of matching tasks. It was built semi-automatically from the Google, Yahoo, and Looksmart web directories. We evaluated TaxME2 using the results of almost two dozen state-of-the-art ontology matching systems. The experiments indicate that the dataset possesses the desired key properties: it is error-free, incremental, discriminative, monotonic, and hard for state-of-the-art ontology matching systems.
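
The quadratic growth mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a candidate correspondence is simply one pair of entities, one from each ontology, and it ignores the type of relation between them. The function name `candidate_pairs` and the ontology sizes are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def candidate_pairs(n_source: int, n_target: int) -> int:
    """Count the entity pairs an annotator would have to inspect.

    Assumption: every entity of the source ontology can potentially
    correspond to every entity of the target ontology.
    """
    return n_source * n_target


# Illustrative sizes only: two ontologies with the same number of entities.
for size in (10, 100, 1_000, 10_000):
    pairs = candidate_pairs(size, size)
    print(f"{size:>6} x {size:<6} -> {pairs:>12,} candidate pairs")
```

Under this assumption, doubling the size of both ontologies quadruples the number of pairs to check, which is why exhaustive manual annotation quickly becomes impractical.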

Type
Original Article
Copyright
Copyright © Cambridge University Press 2009

