Abstract
This paper presents the methodologies and resources used to build Galnet, the Galician version of WordNet. It reviews the extraction processes and the lexicographical and textual sources used to develop this resource, and describes some of its applications in ontology research and terminology processing.
Notes
http://www.globalwordnet.org [17.05.2017].
http://compling.hss.ntu.edu.sg/omw/ [17.05.2017].
http://adimen.si.ehu.es/web/MCR/ [17.05.2017].
http://sli.uvigo.gal/galnet/ [17.05.2017].
More information about WordNet lexicographer files can be found at http://wordnet.princeton.edu/wordnet/man/lexnames.5WN.html [17.05.2017].
http://adimen.si.ehu.es/web/BLC/ [17.05.2017].
http://sli.uvigo.gal/CTG/ [17.05.2017].
http://sli.uvigo.gal/termoteca/ [17.05.2017].
http://adimen.si.ehu.es/web/files/mcr30/mcr30.zip [17.05.2017].
http://sourceforge.net/projects/apertium/ [17.05.2017].
http://gl.wiktionary.org [17.05.2017].
http://babelnet.org [17.05.2017].
http://sli.uvigo.gal/CLUVI/ [17.05.2017].
http://www.gabormelli.com/RKB/SemCor_Corpus [17.05.2017].
http://www.omegawiki.org [17.05.2017].
http://www.geonames.org [17.05.2017].
http://species.wikimedia.org [17.05.2017].
http://sli.uvigo.gal/sinonimos/ [17.05.2017].
We also tried this experiment with two variants, which generated 25,186 candidates, but we rejected this approach in order to limit the number of results and further improve the accuracy of the proposals (Solla Portela and Gómez Guinovart 2014).
http://sli.uvigo.gal/sinonimos/ [17.05.2017].
http://gl.dbpedia.org [17.05.2017].
http://dbpedia.org [17.05.2017].
http://sli.uvigo.gal/SensoGal/ [17.05.2017].
http://sli.uvigo.gal/galnet/galnet.php?version=dev&experiment=econonet [17.05.2017], and similarly for the rest of the experiments collected in this section.
http://sli.uvigo.gal/galnet/ [17.05.2017].
http://sli.uvigo.gal/RILG/ [17.05.2017].
http://www.ihtsdo.org/snomed-ct/ [17.05.2017].
http://sli.uvigo.gal/galnet/termonet.php [17.05.2017].
http://sli.uvigo.gal/CTG/ [17.05.2017].
http://nlp.lsi.upc.edu/freeling/ [17.05.2017].
The full results of this query are available at http://sli.uvigo.gal/galnet/termonet.php?ili=ili-30-06045562-n [17.05.2017].
http://sli.uvigo.gal/galnet/category.php [17.05.2017].
http://gl.dbpedia.org [17.05.2017].
References
Agirre, E., Alegria, I., Rigau, G., & Vossen, P. (2007). MCR for CLIR. Procesamiento del Lenguaje Natural, 38, 3–15.
Agirre, E., & Edmonds, P. (Eds.). (2009). Word sense disambiguation. Berlin: Springer.
Agirre, E., & Soroa, A. (2009). Personalizing PageRank for word sense disambiguation. In Proceedings of the 12th conference of the European chapter of the ACL (pp. 33–41).
Álvarez de la Granja, M. (2003). As locucións verbais galegas. Santiago de Compostela: Universidade de Santiago de Compostela.
Álvarez de la Granja, M., Gómez Clemente, X. M., & Gómez Guinovart, X. (2016). Introducing idioms in the Galician WordNet: Methods, problems and results. Open Linguistics, 2(1), 253–286.
Álvarez Lugrís, A., & Gómez Guinovart, X. (2014). Lexicografía bilingüe práctica basada en corpus: Planificación y elaboración del Dicionario Moderno Inglés-Galego. In M. J. Domínguez Vázquez, X. Gómez Guinovart, & C. Valcárcel Riveiro (Eds.), Lexicografía de las lenguas románicas: Aproximaciones a la lexicografía moderna y contrastiva (pp. 31–48). Berlin: De Gruyter Mouton.
Álvez, J., Atserias, J., Carrera, J., Climent, S., Oliver, A., & Rigau, G. (2008). Consistent annotation of EuroWordNet with the top concept ontology. In Proceedings of the 4th global WordNet conference, GWN, Szeged.
Bentivogli, L., Forner, P., Magnini, B., & Pianta, E. (2004). Revising WordNet domains hierarchy: Semantics, coverage, and balancing. In Proceedings of COLING workshop on multilingual linguistic resources, ACL, Geneva (pp. 101–108).
Cabré i Castellví, M. T. (1992). La terminologia. La teoria, els mètodes, les aplicacions. Editorial Empúries.
Elberrichi, Z., Rahmoun, A., & Bentaalah, M. A. (2008). Using WordNet for text categorization. The International Arab Journal of Information Technology, 5(1), 16–24.
Fellbaum, C. (Ed.). (1998). WordNet: An electronic lexical database. Cambridge: MIT Press.
Fernández Montraveta, A., Vázquez, G., & Fellbaum, C. (2008). The Spanish version of WordNet 3.0. In A. Storrer, A. Geyken, A. Siebert, & K. M. Würzner (Eds.), Text resources and lexical knowledge. Selected papers from the 9th conference on natural language processing KONVENS 2008. Berlin: De Gruyter Mouton.
Ferrández, S., Ferrández, A., Roger, S., López-Moreno, P., & Peral, J. (2006). BRILI, an English–Spanish question answering system. In Proceedings of the international multiconference on computer science and information technology (Vol. 1, pp. 23–29).
Gómez Clemente, X. M., Gómez Guinovart, X., & Simões, A. (Eds.). (2015). Dicionario de sinónimos do galego. Vigo: Xerais.
Gómez Guinovart, X. (2014). Do dicionario de sinónimos á rede semántica: Fontes lexicográficas na construción do WordNet do galego. In A. G. Macedo, C. M. de Sousa, & V. Moura (Eds.), XV Colóquio de Outono—As humanidades e as ciências: Disjunções e confluências. Braga: CEHUM, Universidade do Minho.
Gómez Guinovart, X., & Oliver, A. (2014). Methodology and evaluation of the Galician WordNet expansion with the WN-Toolkit. Procesamiento del Lenguaje Natural, 53, 43–50.
Gómez Guinovart, X., & Simões, A. (2013). Retreading dictionaries for the 21st century. In J. P. Leal, R. Rocha, & A. Simões (Eds.), 2nd Symposium on languages, applications and technologies (pp. 115–126). Saarbrücken: Dagstuhl Publishing.
Gómez Guinovart, X., & Solla Portela, M. A. (2014). O dicionario de sinónimos como recurso para a expansión de WordNet. Linguamática, 6(2), 69–74.
Gonzalez-Agirre, A., Laparra, E., & Rigau, G. (2012). Multilingual central repository version 3.0. In Proceedings of the eighth international conference on language resources and evaluation (LREC’12), ELRA, Istanbul.
Isahara, H., Bond, F., Uchimoto, K., Utiyama, M., & Kanzaki, K. (2008). Development of the Japanese WordNet. In Proceedings of the sixth international conference on language resources and evaluation (LREC’08), ELRA, Marrakech.
Izquierdo, R., Suárez, A., & Rigau, G. (2007). Exploring the automatic selection of basic level concepts. In G. Angelova, K. Bontcheva, R. Mitkov, N. Nicolov, & N. Nikolov (Eds.), Proceedings of the international conference on recent advances on natural language processing (RANLP’07), Incoma, Shoumen (pp. 298–302).
Izquierdo, R., Suárez, A., & Rigau, G. (2015). Word vs. class-based word sense disambiguation. Journal of Artificial Intelligence Research, 54, 83–122.
Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D., & Miller, K. (1990). WordNet: An on-line lexical database. International Journal of Lexicography, 3, 235–244.
Noia Campos, C., Gómez Clemente, X. M., & Benavente, P. (Eds.). (1997). Diccionario de sinónimos da lingua galega. Vigo: Galaxia.
Oliver, A. (2014). WN-Toolkit: Automatic generation of WordNets following the expand model. In Proceedings of the 7th global WordNet conference, GWN, Tartu.
Oliver, A., & Climent, S. (2014). Automatic creation of WordNets from parallel corpora. In Proceedings of the ninth international conference on language resources and evaluation (LREC’14), ELRA, Reykjavik (pp. 1112–1116).
Ordan, N., & Wintner, S. (2007). Hebrew WordNet: A test case of aligning lexical databases across languages. International Journal of Translation, 19(1), 39–58.
Pease, A., Niles, I., & Li, J. (2002). The suggested upper merged ontology: A large ontology for the semantic web and its applications. In Working notes of the AAAI-2002 workshop on ontologies and the semantic web, AAAI, Edmonton.
Plaza, L., Díaz, A., & Gervás, P. (2010). Automatic summarization of news using WordNet concept graphs. IADIS International Journal on Computer Science and Information Systems, 5(1), 45–57.
Pociello, E., Agirre, E., & Aldezabal, I. (2011). Methodology and construction of the Basque WordNet. Language Resources and Evaluation, 45(2), 121–142.
Real Academia de Medicina e Cirurxía de Galicia. (2002). Diccionario galego de termos médicos. Santiago de Compostela: Xunta de Galicia.
Rodríguez Río, X. A. (Ed.). (2008). Vocabulario de medicina: galego-español-inglés-portugués. Santiago de Compostela: Universidade de Santiago de Compostela.
Rosch, E. (1978). Principles of categorization. In E. Rosch & B. B. Lloyd (Eds.), Cognition and categorization (pp. 27–48). Hillsdale: Lawrence Erlbaum.
Solla Portela, M. A., & Gómez Guinovart, X. (2014). Ampliación de WordNet mediante extracción léxica a partir de un diccionario de sinónimos. In L. A. Ureña, et al. (Eds.), Actas de las V Jornadas de la Red en Tratamiento de la Información Multilingüe y Multimodal, CEUR Workshop Proceedings (CEUR-WS.org), Aachen (Vol. 1199, pp. 29–32).
Solla Portela, M. A., & Gómez Guinovart, X. (2016). DBpedia del gallego: Recursos y aplicaciones en procesamiento del lenguaje. Procesamiento del Lenguaje Natural, 57, 139–142.
Vintar, Š., Fišer, D., & Vrščaj, A. (2012). Were the clocks striking or surprising? Using WSD to improve MT performance. In Proceedings of the joint workshop on exploiting synergies between information retrieval and machine translation (ESIRMT) and hybrid approaches to machine translation (HyTra) (EACL 2012), ACL, Stroudsburg (pp. 87–92).
Vossen, P. (Ed.). (1998). EuroWordNet: A multilingual database with lexical semantic networks. Norwell: Kluwer Academic Publishers.
Vossen, P. (2002). WordNet, EuroWordNet and global WordNet. Revue française de linguistique appliquée, 7, 27–38.
Zhao, F., Fang, F., Yan, F., Jin, H., & Zhang, Q. (2012). Expanding approach to information retrieval using semantic similarity analysis based on WordNet and Wikipedia. International Journal of Software Engineering and Knowledge Engineering, 22(2), 305–322.
Acknowledgements
We would like to express our gratitude to Patricia Sotelo Dios for her invaluable help in the writing of this article.
Additional information
This research has been carried out thanks to the projects SKATeR (TIN2012-38584-C06-04) and TUNER (TIN2015-65308-C5-1-R) supported by the Ministry of Economy and Competitiveness of the Spanish Government and the European Fund for Regional Development (MINECO/FEDER).
Cite this article
Guinovart, X.G., Portela, M.A.S. Building the Galician wordnet: methods and applications. Lang Resources & Evaluation 52, 317–339 (2018). https://doi.org/10.1007/s10579-017-9408-5
DOI: https://doi.org/10.1007/s10579-017-9408-5