Abstract
We study the most frequent Spanish verb-noun combinations retrieved from the Spanish Web Corpus. We present statistics on these combinations and analyze the degree of cohesiveness of their components. For the verb-noun combinations that turned out to be collocations, we determined their semantics in the form of lexical functions. We also observed which word senses are most typical of polysemous words in the combinations under study and determined the level of generalization that characterizes the semantics of the words in these combinations, that is, the level of the hyperonymy-hyponymy tree at which they are located. The data we collected can be used in various natural language processing applications, especially in predictive models that take the most frequent cases into account.
Acknowledgements
The authors are grateful to Vojtěch Kovář for providing the list of the most frequent verb-noun pairs from the Spanish Web Corpus of the Sketch Engine, www.sketchengine.co.uk. The authors also appreciate the support of the Mexican Government, which made it possible to complete this work: SNI-CONACYT, BEIFI-IPN, SIP-IPN: grants 20162064, 20161958, and 20162204, and the EDI Program. We give special thanks to Dr. Noé Alejandro Castro-Sánchez for collecting the statistics of verb senses in the Diccionario de la lengua española (DRAE).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kolesnikova, O., Gelbukh, A. (2017). Characteristics of Most Frequent Spanish Verb-Noun Combinations. In: Sidorov, G., Herrera-Alcántara, O. (eds) Advances in Computational Intelligence. MICAI 2016. Lecture Notes in Computer Science, vol 10061. Springer, Cham. https://doi.org/10.1007/978-3-319-62434-1_3
DOI: https://doi.org/10.1007/978-3-319-62434-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62433-4
Online ISBN: 978-3-319-62434-1
eBook Packages: Computer Science (R0)