Abstract
Spanish is a language with very precise and regular orthographic rules. A syllabication algorithm strictly based on syntactic analysis, not requiring any semantic knowledge, is presented and further extended to include hyphenation. Algorithms are presented as pattern matching schemata, and efficient implementations are considered.
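The abstract does not reproduce the paper's pattern-matching schemata, so the following is only a minimal sketch of how rule-based Spanish syllabication can be expressed with patterns, not the authors' algorithm. The names `NUCLEUS`, `INSEPARABLE`, `syllabify`, and `hyphenate` are hypothetical. The sketch assumes the usual orthographic conventions: unaccented i and u join an adjacent vowel in a diphthong, a single consonant between vowels opens the next syllable, and groups such as pr, bl, ch, ll, and rr are never split. It treats y and h as plain consonants and ignores prefix-based exceptions such as sub-ra-yar.

```python
import re

# Vowel nucleus (assumed pattern): optional weak vowels, one strong or
# accented vowel, and an optional trailing weak vowel; or a run of weak
# vowels on its own, as in "ciudad".
NUCLEUS = re.compile(r"[iuü]*[aeoáéóíú][iu]?|[iuü]+")

# Two-letter groups that are never split and always open a syllable.
INSEPARABLE = re.compile(r"(?:[pbfcgtd]r|[pbfcg]l|ch|ll|rr)$")


def syllabify(word):
    """Split a single Spanish word into syllables (illustrative sketch)."""
    lower = word.lower()
    nuclei = [m.span() for m in NUCLEUS.finditer(lower)]
    if not nuclei:
        return [word]
    cuts = []
    for (_, prev_end), (next_start, _) in zip(nuclei, nuclei[1:]):
        run = lower[prev_end:next_start]  # consonants between two nuclei
        if not run:
            onset = 0  # hiatus: the cut falls between the two vowels
        elif INSEPARABLE.search(run):
            onset = 2  # an inseparable group moves whole to the next syllable
        else:
            onset = 1  # otherwise only the last consonant moves
        cuts.append(next_start - onset)
    bounds = [0] + cuts + [len(word)]
    return [word[a:b] for a, b in zip(bounds, bounds[1:])]


def hyphenate(word, sep="-"):
    """Mark break points, never stranding a one-letter syllable at either
    end of the word (an assumed simplification of the hyphenation step)."""
    parts = syllabify(word)
    if len(parts) > 1 and len(parts[0]) == 1:
        parts[:2] = [parts[0] + parts[1]]
    if len(parts) > 1 and len(parts[-1]) == 1:
        parts[-2:] = [parts[-2] + parts[-1]]
    return sep.join(parts)
```

For example, `syllabify("transporte")` yields the three syllables trans, por, te, while `hyphenate("amigo")` allows only the break ami-go, because the first syllable is a lone vowel.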
Index Terms
- Word division in Spanish