ABSTRACT
The recent enhancement of Lemlat, a morphological analyser for Latin, with a large Onomasticon makes it possible to analyse both the morphology and the distribution of loanwords in the Latin lexicon. In this paper, we first describe the categories of proper names that could not be inserted into Lemlat automatically, showing that a large part of them are loanwords. We then present the results of a qualitative analysis of loanwords aimed at detecting the 'exceptional' endings that identify loanwords whose inflectional properties have not been assimilated to the regular ones of the Latin morphological system. Finally, we report a quantitative analysis of the data to study the frequency of such loanwords in Latin texts.
Index Terms
- The Impact of Unassimilated Loanwords on the Latin Lexicon. A Qualitative and Quantitative Analysis