Abstract
This paper presents a cross-lingual methodology for analyzing verbal argument structures to uncover syntactic-semantic patterns shared by verbal complements across languages. The primary contribution is a novel semantic model for encoding verbal arguments in multiple languages. The methodology builds on the k-Multilingual Concept (\(MC^k\)) model, a state-of-the-art automated system for retrieving and aligning semantically equivalent lexical items across k languages. We integrated WordNet, BabelNet, and VerbNet into a framework that accommodates the specific demands of verbal context. The methodology is implemented in a highly scalable pipeline that produces VerbAligNet, a new resource encoding over 6k verbal arguments for 600+ verb senses and showcasing prevalent usage patterns across 9 valency frames in three languages. The evaluation demonstrates the accuracy of the methodology in extracting semantically equivalent verbal arguments for diverse verbs.
Notes
- 1.
Examples are all taken from the source article [14].
- 2.
For example, the verb break can be used transitively (John broke the window) or intransitively (The window broke), representing one transitivity alternation.
- 3.
The Recipient can also be expressed as an oblique object encoded in a prepositional phrase (PP), e.g. “She gave the book to him”. However, this frame (NP V NP PP.recipient) is outside the scope of our analysis.
- 4.
BabelNet high-quality lexicalizations are those word forms that are not marked as resulting from an automatic translation.
- 5.
- 6.
Updates to the resource will be provided in the same repository.
- 7.
- 8.
The annotator who performed the evaluation is a native Italian speaker with at least C1 proficiency in English and German, which supports the reliability of the evaluation.
- 9.
That is, \(EN \leftrightarrow IT\); \(EN \leftrightarrow DE\); \(DE \leftrightarrow IT\).
- 10.
References
Baisa, V., Michelfeit, J., Medveď, M., Jakubíček, M.: European Union language resources in Sketch Engine. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pp. 2799–2803 (2016)
Baker, C.F., Fillmore, C.J., Lowe, J.B.: The Berkeley FrameNet project. In: Annual Meeting of the Association for Computational Linguistics (1998)
Bird, S.: NLTK: the natural language toolkit. In: Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions, pp. 69–72 (2006)
Bond, F., Foster, R.: Linking and extending an open multilingual wordnet. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1352–1362 (2013)
Bond, F., Vossen, P., McCrae, J., Fellbaum, C.: CILI: the collaborative interlingual index. In: Proceedings of the 8th Global WordNet Conference (GWC), pp. 50–57. Global Wordnet Association, Bucharest, Romania 27–30 January 2016
Bowerman, M.: Mapping thematic roles onto syntactic functions: are children helped by innate linking rules? (1990)
Comrie, B.: Language universals and linguistic typology: Syntax and morphology (1981)
Croft, W.B.: Typology and universals (1994)
Dixon, R.M.W., Aikhenvald, A.Y.: Changing valency case studies in transitivity: Introduction (2000)
Dryer, M.: Order of subject, object and verb. The World Atlas of Language Structures, pp. 330–333 (2005)
Gildea, D., Jurafsky, D.: Automatic labeling of semantic roles. Comput. Linguist. 28, 245–288 (2002)
González, A.Á., Navarro, Í.: Verb valency changes: Theoretical and typological perspectives (2017)
Grasso, F., Di Caro, L.: A methodology for large-scale, disambiguated and unbiased lexical knowledge acquisition based on multilingual word alignment. In: Fersini, E., Passarotti, M., Patti, V. (eds.) Proceedings of the Eighth Italian Conference on Computational Linguistics, CLiC-it 2021, Milan, Italy, 26–28 January 2022. CEUR Workshop Proceedings, vol. 3033. CEUR-WS.org (2021)
Grasso, F., Lovera Rulfi, V., Di Caro, L.: MultialigNet: cross-lingual knowledge bridges between words and senses. In: International Conference Knowledge Engineering and Knowledge Management (2022)
Greenberg, J.H.: Some universals of grammar with particular reference to the order of meaningful elements. On Language (1990)
Grefenstette, G.: Cross language information retrieval. In: Conference of the Association for Machine Translation in the Americas (1998)
Hamp, B., Feldweg, H.: GermaNet - a lexical-semantic net for German. In: Workshop On Automatic Information Extraction And Building Of Lexical Semantic Resources For NLP Applications (1997)
Hellan, L., Malchukov, A.L., Cennamo, M.: Introduction. Issues in contrastive valency studies (2017)
Jakubíček, M., Kilgarriff, A., Kovář, V., Rychlý, P., Suchomel, V.: The TenTen corpus family. In: 7th International Corpus Linguistics Conference CL, pp. 125–127 (2013)
Kamholz, D., Pool, J., Colowick, S.M.: PanLex: building a resource for panlingual lexical translation. In: LREC, pp. 3145–3150 (2014)
Kilgarriff, A., et al.: The Sketch Engine: ten years on. Lexicography 1(1), 7–36 (2014)
Levin, B.: English verb classes and alternations: a preliminary investigation (1993)
Luraghi, S.: Basic valency orientation, the anticausative alternation, and voice in PIE. In: Akten der 16th Fachtagung der Indogermanischen Gesellschaft, pp. 259–274 (2019)
Majewska, O., et al.: BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine. J. Biomed. Semant. 12 (2021)
Marantz, A.: Verbal argument structure: events and participants. Lingua 130, 152–168 (2013)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Miller, G.A., Chodorow, M., Landes, S., Leacock, C., Thomas, R.G.: Using a semantic concordance for sense identification. In: Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, 8-11 March 1994 (1994)
Navigli, R., Ponzetto, S.P.: BabelNet: building a very large multilingual semantic network. In: Proceedings of ACL, pp. 216–225. Association for Computational Linguistics (2010)
Nichols, J., Peterson, D.A., Barnes, J.: Transitivizing and detransitivizing languages. Linguistic Typol. 8 (2004)
Palmer, M., Kingsbury, P.R., Gildea, D.: The proposition bank: an annotated corpus of semantic roles. Comput. Linguist. 31, 71–106 (2005)
Roventini, A., Alonge, A., Calzolari, N., Magnini, B.: ItalWordNet: a large semantic database for Italian (2000)
Schuler, K.K., Palmer, M.: VerbNet: a broad-coverage, comprehensive verb lexicon (2005)
Siegel, M., Bond, F.: OdeNet: compiling a GermanWordNet from other resources. In: Proceedings of the 11th Global Wordnet Conference, pp. 192–198. Global Wordnet Association, University of South Africa (UNISA) (2021). https://aclanthology.org/2021.gwc-1.22
Talmy, L.: Lexicalization patterns: semantic structure in lexical forms (1987)
Trampuš, M., Novak, B.: Internals of an aggregated web news feed. In: Proceedings of 15th Multiconference on Information Society, pp. 221–224 (2012)
Vulic, I., Moens, M.F.: Monolingual and cross-lingual information retrieval models based on (bilingual) word embeddings. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (2015)
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Grasso, F., Rulfi, V.L., Caro, L.D. (2024). VerbAligNet: Unlocking Multilingual Exploration of Verbal Arguments. In: Garoufallou, E., Sartori, F. (eds) Metadata and Semantic Research. MTSR 2023. Communications in Computer and Information Science, vol 2048. Springer, Cham. https://doi.org/10.1007/978-3-031-65990-4_1
DOI: https://doi.org/10.1007/978-3-031-65990-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-65989-8
Online ISBN: 978-3-031-65990-4
eBook Packages: Computer Science (R0)