Word division in Spanish

Published: 01 July 1987
Abstract

Spanish is a language with very precise and regular orthographic rules. A syllabication algorithm strictly based on syntactic analysis, not requiring any semantic knowledge, is presented and further extended to include hyphenation. Algorithms are presented as pattern matching schemata, and efficient implementations are considered.
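The idea of syllabication as pattern matching over spelling alone can be illustrated with a minimal sketch. This is not the paper's algorithm: the rule set below (digraphs ch/ll/rr, the usual inseparable onset clusters, and the strong/weak vowel distinction for diphthongs and hiatus) is an assumed simplification of standard Spanish orthographic rules, and the names tokenize and syllabify are hypothetical. It ignores prefixes, word-final y, and rarer vowel sequences.

```python
# Simplified, assumed rule set; not taken from the paper.
VOWELS = set("aeiouáéíóúü")
# Accented í/ú break a diphthong, so they are grouped with the strong vowels.
STRONG = set("aeoáéóíú")
DIGRAPHS = {"ch", "ll", "rr"}
# Consonant pairs that always start a syllable together.
ONSET_CLUSTERS = {"pr", "pl", "br", "bl", "fr", "fl",
                  "tr", "dr", "cr", "cl", "gr", "gl"}


def tokenize(word):
    """Split a word into letters, keeping ch/ll/rr as single units."""
    units, i = [], 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def syllabify(word):
    """Return the syllables of a lowercase-normalized Spanish word."""
    units = tokenize(word.lower())

    # 1. Locate vowel nuclei. Adjacent vowels share a nucleus (diphthong
    #    or triphthong) unless both are strong, which is a hiatus.
    nuclei, i = [], 0
    while i < len(units):
        if units[i] in VOWELS:
            j = i + 1
            while (j < len(units) and units[j] in VOWELS
                   and not (units[j - 1] in STRONG and units[j] in STRONG)):
                j += 1
            nuclei.append((i, j))
            i = j
        else:
            i += 1
    if not nuclei:
        return ["".join(units)]

    # 2. Distribute the consonants between consecutive nuclei: the next
    #    syllable takes an inseparable cluster if one is present,
    #    otherwise a single consonant; the rest stays as coda.
    cuts = []
    for (_, end1), (start2, _) in zip(nuclei, nuclei[1:]):
        cons = units[end1:start2]
        if len(cons) >= 2 and "".join(cons[-2:]) in ONSET_CLUSTERS:
            onset = 2
        elif cons:
            onset = 1
        else:
            onset = 0
        cuts.append(start2 - onset)

    # 3. Cut the unit sequence at the computed boundaries.
    pieces, prev = [], 0
    for cut in cuts + [len(units)]:
        pieces.append("".join(units[prev:cut]))
        prev = cut
    return pieces
```

Under these assumptions, syllabify("instrucción") yields ["ins", "truc", "ción"] and syllabify("leía") yields ["le", "í", "a"]. A hyphenator would then choose among the syllable boundaries, applying further typographic conventions that this sketch does not model.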

References

  1. Hornby, A.S. Oxford Advanced Learner's Dictionary of Current English. 3rd ed. Oxford University Press, New York, 1974.
  2. Knuth, D.E. TEX and Metafont. New Directions in Typesetting. Digital Press, Bedford, Mass., 1979.
  3. Lesk, M.E., and Schmidt, E. LEX - A lexical analyzer generator. Comput. Sci. Tech. Rep. 39, Bell Laboratories, Murray Hill, N.J., Oct. 1975.
  4. Mañas, J.A. Tratamiento previo de textos redactados en castellano. Intern. Rep. FISS-I-15.1-SF-85, Facultad de Informática, San Sebastián, Spain, Sept. 1985 (in Spanish).
  5. Ossanna, J.F. nroff/troff user's manual. Comput. Sci. Tech. Rep. 54, Bell Laboratories, Murray Hill, N.J., 1976.
  6. Real Academia Española. Esbozo de una Nueva Gramática de la Lengua Española. Espasa-Calpe, S.A., Madrid, Spain, 1973.

Reviews

Mario Borillo

The author describes a method for the insertion of hyphenation procedures for Spanish into text processing systems, most of which were designed for use with English. He studies the specific problems that arise from the syllabic structures of Spanish as well as from the typographical conventions that Hispanic writers have traditionally adopted. The method described is primarily based on the study of Spanish orthographic and syllabic structures, which provide the first set of breaking rules. A second layer of rules translates traditional conventions of word partitioning. The third component is an aesthetic parameter: an adjustable threshold that allows the user to balance the percentage of hyphenations against the volume of spaces for a given document. This threshold introduces a pleasant flexibility in the modulation of the rules. The set of algorithms has been implemented in C and tested with the lexical analyzer generator lex, and the author gives some performance data. This work will interest people working on text processing; in a wider context, it makes a useful contribution to the current research in document structures retrieval [1]. A lot of Hispanic people should also appreciate this work, as Spanish is a major human language and the algorithms are linguistically rooted.

Published in

Communications of the ACM, Volume 30, Issue 7
July 1987
53 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/28569

Copyright © 1987 ACM

Publisher

Association for Computing Machinery

New York, NY, United States
