Abstract
This paper describes a methodology for automatically identifying proverbs and their variants in running text. The methodology starts from existing compilations of proverbs, explores the regular syntactic structures that most proverbs present, and intersects those syntactic structures with the lexical units of the proverbs. Based on the syntactic regularities, we divided the data into 13 classes. Finite-state automata are used to represent the regular patterns found in each class. The method achieved a precision of 74.68% when tested on a Brazilian Portuguese journalistic corpus.
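To make the idea of intersecting a regular syntactic frame with a proverb's lexical units more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the slot representation, the function names `compile_pattern` and `find_proverb`, the `max_gap` parameter, and the example proverb with its variants are all assumptions chosen for illustration. A regular expression stands in for the finite-state automaton, since the two are equivalent in expressive power.

```python
import re

def compile_pattern(slots, max_gap=2):
    """Compile a proverb frame into a regex (a finite-state pattern).

    slots   -- list of slots; each slot lists the word forms allowed there
    max_gap -- hypothetical tolerance: extra tokens allowed between slots
    """
    # Lazily allow up to max_gap intervening tokens, then required whitespace.
    gap = r"(?:\s+\w+){0,%d}?\s+" % max_gap
    parts = [
        "(?:" + "|".join(re.escape(form) for form in slot) + ")"
        for slot in slots
    ]
    return re.compile(r"\b" + gap.join(parts) + r"\b", re.IGNORECASE)

def find_proverb(regex, text):
    """Return the matched span of text, or None if the frame is absent."""
    match = regex.search(text)
    return match.group(0) if match else None

# Illustrative frame of the "Quem V, V" type; the variant list is assumed,
# not taken from the paper's data.
QUEM_ESPERA = compile_pattern([
    ["quem"],
    ["espera"],
    ["alcança", "consegue"],
])

if __name__ == "__main__":
    for sentence in [
        "Como diz o ditado, quem espera sempre alcança.",
        "Quem espera um dia consegue",
        "quem espera o ônibus todo dia nunca alcança",
    ]:
        print(repr(sentence), "->", find_proverb(QUEM_ESPERA, sentence))
```

Keeping the lexical anchors fixed while bounding the number of intervening tokens is one simple way to accept variants without matching arbitrary sentences that merely share the same words; a real system would likely work on lemmas and part-of-speech tags rather than surface forms.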
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Rassi, A., Baptista, J., Vale, O. (2014). Proverb Variation: Experiments on Automatic Detection in Brazilian Portuguese Texts. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_14
DOI: https://doi.org/10.1007/978-3-319-09761-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09760-2
Online ISBN: 978-3-319-09761-9
eBook Packages: Computer Science (R0)