Abstract
We present a large-scale multilingual lexical resource, the Universal Knowledge Core (UKC), which is organized like a Wordnet but with one major design difference. In the UKC, the meaning of words is represented not only by synsets but also by language-independent concepts, each of which clusters the synsets that codify the same meaning in different languages. In the UKC it is concepts, and not synsets as is the case in the Wordnets, that are connected in a semantic network. The use of language-independent concepts allows for the native integrability, analysis, and use of any number of languages, with important applications in, e.g., multilingual language processing, reasoning (as needed, for instance, in data and knowledge integration), and image understanding.
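The two-layer design described above can be illustrated with a minimal sketch. It is not the UKC implementation: the class names (Synset, Concept), the fields, the translate helper, and the sample English and Italian entries are all hypothetical, chosen only to show synsets attached to language-independent concepts, with the semantic links running between concepts rather than between synsets.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Synset:
    """A language-specific set of words that codify one meaning (hypothetical model)."""
    language: str
    words: tuple[str, ...]


@dataclass
class Concept:
    """A language-independent meaning that clusters synsets across languages."""
    concept_id: int
    gloss: str
    # At most one synset per language code in this simplified sketch.
    synsets: dict[str, Synset] = field(default_factory=dict)
    # Semantic relations connect concepts, never synsets.
    hypernyms: set[int] = field(default_factory=set)

    def add_synset(self, synset: Synset) -> None:
        self.synsets[synset.language] = synset

    def lexicalized_in(self, language: str) -> bool:
        return language in self.synsets


def translate(concepts: dict[int, Concept], word: str, source: str, target: str) -> list[str]:
    """Collect target-language words that share a concept with `word`."""
    results: list[str] = []
    for concept in concepts.values():
        src = concept.synsets.get(source)
        if src is not None and word in src.words and target in concept.synsets:
            results.extend(concept.synsets[target].words)
    return results


# Illustrative data only; not taken from the UKC.
animal = Concept(1, "a living organism that can move voluntarily")
animal.add_synset(Synset("en", ("animal",)))
animal.add_synset(Synset("it", ("animale",)))

dog = Concept(2, "a domesticated canine", hypernyms={1})
dog.add_synset(Synset("en", ("dog", "domestic dog")))
dog.add_synset(Synset("it", ("cane",)))

concepts = {c.concept_id: c for c in (animal, dog)}
print(translate(concepts, "dog", "en", "it"))  # prints ['cane']
```

Because the hypernym link is stored on the concept, adding a further language in this sketch only means attaching one more synset to an existing concept; the network itself is left unchanged.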
Notes
1. See http://globalwordnet.org/ for a compilation of the most relevant resources available today.
2. The word "knowledge" in UKC is motivated by our focus on studying language not per se but as a key component of reasoning systems.
3. From February 2018, the UKC will be browsable online at http://kidf.eu.
References
Miller, G.A., Beckwith, R., Fellbaum, C., Gross, D., Miller, K.J.: Introduction to wordnet: an on-line lexical database. Int. J. Lexicogr. 3(4), 235–244 (1990)
Vossen, P.: Introduction to EuroWordNet. Comput. Humanit. 32(2–3), 73–89 (1998)
Pianta, E., Bentivogli, L., Girardi, C.: Multi-wordnet: developing an aligned multilingual database. In: Proceedings of the First International Conference on Global WordNet, Mysore, India, pp. 21–25, January 2002
Gonzalez-Agirre, A., Laparra, E., Rigau, G.: Multilingual central repository version 3.0. In: LREC, pp. 2525–2529 (2012)
Giunchiglia, F., Batsuren, K., Bella, G.: Understanding and exploiting language diversity. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), pp. 4009–4017 (2017)
Henrich, J., Heine, S.J., Norenzayan, A.: The weirdest people in the world? Behav. Brain Sci. 33(2–3), 61–83 (2010)
Bella, G., Giunchiglia, F., McNeill, F.: Language and domain aware lightweight ontology matching. In: Web Semantics: Science, Services and Agents on the World Wide Web (2017)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009 (2009)
Deng, J., Russakovsky, O., Krause, J., Bernstein, M., Berg, A.C., Fei-Fei, L.: Scalable multi-label annotation. In: ACM Conference on Human Factors in Computing Systems (CHI) (2014)
Giunchiglia, F., Fumagalli, M.: Concepts as (recognition) abilities. In: Formal Ontology in Information Systems: Proceedings of the 9th International Conference (FOIS 2016), pp. 153–166 (2016)
Giunchiglia, F., Fumagalli, M.: Teleologies: objects, actions and functions. In: Proceedings of the 36th International Conference on Conceptual Modeling (ER 2017) (2017)
Millikan, R.G.: On Clear and Confused Ideas: An Essay About Substance Concepts. Cambridge University Press, Cambridge (2000)
Kjellmer, G.: Lexical gaps. Lang. Comput. 48(1), 149–158 (2003)
Bentivogli, L., Pianta, E.: Looking for lexical gaps. In: Proceedings of the Ninth EURALEX International Congress, pp. 8–12. Universität Stuttgart, Stuttgart (2000)
Cvilikaitė, J.: Lexical gaps: resolution by functionally complete units of translation. Darbai ir dienos 2006(45), 127–142 (2006)
Lehrer, A.: Notes on lexical gaps. J. Linguist. 6(2), 257–261 (1970)
Crowley, T., Bowern, C.: An Introduction to Historical Linguistics, 4th edn. Oxford University Press, Oxford (2010)
Croft, W.: Typology and Universals. Cambridge University Press, Cambridge (2002)
Rijkhoff, J., Bakker, D., Hengeveld, K., Kahrel, P.: A method of language sampling. Stud. Lang. 17(1), 169–203 (1993)
Bell, A.: Language samples. Univers. Hum. Lang. 1, 123–156 (1978)
McMahon, A., McMahon, R.: Language Classification by Numbers. Oxford University Press, Oxford (2005)
Swadesh, M.: Towards greater accuracy in lexicostatistic dating. Int. J. Am. Linguist. 21(2), 121–137 (1955)
Swadesh, M.: The Origin and Diversification of Language. Transaction Publishers, Piscataway (1971)
Greenberg, J.H.: Univers. Lang. (1966)
Youn, H., et al.: On the universal structure of human lexical semantics. Proc. Natl. Acad. Sci. 113(7), 1766–1771 (2016)
Miller, G.A.: Nouns in wordnet: a lexical inheritance system. Int. J. Lexicogr. 3(4), 245–264 (1990)
Navigli, R., Ponzetto, S.P.: Babelnet: building a very large multilingual semantic network. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 216–225. Association for Computational Linguistics (2010)
Bond, F., Foster, R.: Linking and extending an open multilingual wordnet. In: ACL, vol. 1, pp. 1352–1362 (2013)
Bond, F., Vossen, P., McCrae, J.P., Fellbaum, C.: Cili: the collaborative interlingual index. In: Proceedings of the Global WordNet Conference, vol. 2016 (2016)
Vossen, P., Bond, F., McCrae, J.: Toward a truly multilingual globalwordnet grid. In: Proceedings of the Eighth Global WordNet Conference, pp. 25–29 (2016)
Von Fintel, K., Matthewson, L.: Universals in semantics. Linguist. Rev. 25(1–2), 139–201 (2008)
Giunchiglia, F., Autayeu, A., Pane, J.: S-match: an open source framework for matching lightweight ontologies. Semant. Web 3(3), 307–317 (2012)
Acknowledgments
The first version of the UKC was developed by Ilya Zaihrayeu, around 2004. This implementation was revised many times, most often as a joint effort between Ilya and Marco Marasca. Since the beginning, the UKC has been designed with the goal of supporting the automation of reasoning based on information extracted from text, the original goal being the matching of ontologies [32]. We thank the many postdocs and PhD students who have extensively used the UKC in their research.
The current work is supported by QROWD (http://qrowd-project.eu), a Horizon 2020 project, under Grant Agreement No. 732194. The second author is supported by the ESSENCE Marie Curie Initial Training Network, funded by the European Commission’s 7th Framework Programme under grant agreement no. 607062.
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Giunchiglia, F., Batsuren, K., Alhakim Freihat, A. (2023). One World - Seven Thousand Languages (Best Paper Award, Third Place). In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13396. Springer, Cham. https://doi.org/10.1007/978-3-031-23793-5_19
DOI: https://doi.org/10.1007/978-3-031-23793-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23792-8
Online ISBN: 978-3-031-23793-5
eBook Packages: Computer Science, Computer Science (R0)