Abstract
Corpus Linguistics lacks a widely accepted pattern for constructing text corpora. As a result, compilation can follow many different paths, which yields research products that are difficult to interface with one another and have low continuity. The variety of existing corpora aggravates the problem, since individual studies use different forms of expression to describe and solve similar problems. We therefore identify a gap in corpus-creation initiatives: they tend to be conducted in an “exploratory” way. Based on the Lapelinc Workflow for the construction of historical corpora, the Lapelinc Framework presents a working pattern for construction and research activities involving Portuguese-language corpora, establishing a set of steps and by-products common to corpus-based linguistic research. Its Import Stage tools have a reduced learning curve and allow efforts to converge during development. They add functionality for multiple purposes in corpus construction and, through the resources they make available, simplify the construction of multi-type corpora.
Acknowledgment
We thank FAPESB and CNPq, as this work is linked to thematic projects funded by FAPESB (APP0007/2016 and APP0014/2016) and CNPq (436209/2018-7); the Graduate Program in Linguistics (PPGLIN); the Corpus Linguistics Research Laboratory (LAPELINC); the State University of Southwest Bahia (UESB); and advisors Prof. Dr. Jorge Viana Santos and Prof. Dr. Cristiane Namiuti.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Costa, B.S., Santos, J.V., Namiuti, C., Costa, A.S. (2022). The Systematic Construction of Multiple Types of Corpora Through the Lapelinc Framework. In: Pinheiro, V., et al. Computational Processing of the Portuguese Language. PROPOR 2022. Lecture Notes in Computer Science, vol 13208. Springer, Cham. https://doi.org/10.1007/978-3-030-98305-5_37
DOI: https://doi.org/10.1007/978-3-030-98305-5_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98304-8
Online ISBN: 978-3-030-98305-5
eBook Packages: Computer Science (R0)