Abstract

In this paper, we present a novel method for integrating wordnet data. The proposed method is based on the XML representation of wordnet content. In particular, we focus on integrating VisDic-based documents that represent the data of two Polish wordnets, namely plWordNet and Polnet. A key feature of the method is that it automatically identifies and handles discrepancies in the structure of the integrated documents. Apart from the method itself, we briefly discuss its C#-based implementation. Finally, we present statistical measures of the data available before and after the integration process. This statistical comparison allows us to determine, among other things, the impact of each wordnet on the integrated data set.
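To make the idea of structural discrepancies between two XML wordnet documents more concrete, the following is a minimal sketch in Python. It is not the authors' C#-based implementation, and it does not reproduce their method. The element names (`WN`, `SYNSET`, `ID`, `POS`, `SYNONYM`, `LITERAL`, `DEF`, `USAGE`), the sample data, and the helper functions `synset_child_tags` and `structure_diff` are all assumptions introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical VisDic-like fragments. Element names and contents are
# illustrative assumptions, not data taken from plWordNet or Polnet.
DOC_A = """<WN>
  <SYNSET><ID>A-1</ID><POS>n</POS>
    <SYNONYM><LITERAL>pies</LITERAL></SYNONYM>
    <DEF>domestic animal</DEF>
  </SYNSET>
</WN>"""

DOC_B = """<WN>
  <SYNSET><ID>B-7</ID><POS>n</POS>
    <SYNONYM><LITERAL>pies</LITERAL></SYNONYM>
    <USAGE>example sentence</USAGE>
  </SYNSET>
</WN>"""


def synset_child_tags(xml_text):
    """Collect the names of all direct children of SYNSET elements."""
    root = ET.fromstring(xml_text)
    tags = set()
    for synset in root.iter("SYNSET"):
        tags.update(child.tag for child in synset)
    return tags


def structure_diff(xml_a, xml_b):
    """Report which SYNSET child elements are shared and which occur in one document only."""
    a = synset_child_tags(xml_a)
    b = synset_child_tags(xml_b)
    return {
        "common": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


diff = structure_diff(DOC_A, DOC_B)
print(diff)
```

A report of this kind only flags elements that appear in one document but not the other. How such differences are then reconciled during integration is the subject of the paper itself and is not covered by this sketch.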

Acknowledgments

The reported study was partially supported by the European Union from the FP7-PEOPLE-2013-IAPP AutoUniMo project Automotive Production Engineering Unified Perspective based on Data Mining Methods and Virtual Factory Model (grant agreement no. 612207) and by research work financed from funds for science in the years 2016–2017 allocated to an international co-financed project (grant agreement no. 3491/7.PR/15/2016/2). It was also partially supported by the Institute of Informatics research grant no. BKM/507/RAU2/2016.

Author information

Corresponding author

Correspondence to Tomasz Jastrząb.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Krasnokucki, D., Kwiatkowski, G., Jastrząb, T. (2017). A New Method of XML-Based Wordnets’ Data Integration. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation. BDAS 2017. Communications in Computer and Information Science, vol 716. Springer, Cham. https://doi.org/10.1007/978-3-319-58274-0_25

  • DOI: https://doi.org/10.1007/978-3-319-58274-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58273-3

  • Online ISBN: 978-3-319-58274-0

  • eBook Packages: Computer Science, Computer Science (R0)
