
Studying the Advantages of a Messy Evolutionary Algorithm for Natural Language Tagging

  • Conference paper
Genetic and Evolutionary Computation — GECCO 2003 (GECCO 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2724)


Abstract

The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc.) is called tagging, and it is a key step in parsing and in many other language processing and generation applications. Automatic lexical taggers are usually based on statistical methods, such as Hidden Markov Models, which work with information extracted from large available tagged corpora. This information consists of the frequencies of the contexts of the words, that is, of the sequences of their neighbouring tags. These methods therefore rely on the assumption that the tag of a word depends only on its surrounding tags. This work proposes the use of a Messy Evolutionary Algorithm to investigate the validity of this assumption. The algorithm is an extension of the fast messy genetic algorithms, a variety of Genetic Algorithms that improve the survival of high-quality partial solutions, or building blocks. Messy GAs do not require all genes to be present in a chromosome, and a gene may also appear more than once. This allows us to study the kind of building blocks that arise, and thus to obtain information about possible relationships between the tag of a word and other tags at any position in the sentence. The paper describes the design of a messy evolutionary algorithm for the tagging problem and a number of experiments on the performance of the system and the parameters of the algorithm.
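To make the chromosome property mentioned in the abstract concrete, the following is a minimal sketch of how a messy chromosome could be decoded into a tagging. It is not taken from the paper: the `(position, tag)` gene encoding, the `decode` function, the tag names, and the example sentence are all assumptions made for illustration. It uses the conventions commonly described for messy GAs, where a repeated gene is resolved by keeping its first occurrence and a missing gene is filled in from a template.

```python
from typing import List, Tuple

# Hypothetical encoding: a messy gene is a (position, tag) pair, where
# position is the index of a word in the sentence and tag is a candidate
# lexical category for that word.
Gene = Tuple[int, str]


def decode(chromosome: List[Gene], template: List[str]) -> List[str]:
    """Express a messy chromosome as exactly one tag per word.

    Over-specification (a position appearing more than once) is resolved
    left to right: the first gene for a position wins.
    Under-specification (a position with no gene) falls back to the
    template, assumed here to be some default tagging of the sentence.
    """
    tags = list(template)      # start from the template
    seen = set()               # positions already expressed
    for position, tag in chromosome:
        if position in seen:
            continue           # a later duplicate is ignored
        seen.add(position)
        tags[position] = tag
    return tags


# Illustrative data only; the tag set and sentence are assumptions.
sentence = ["the", "old", "man", "the", "boats"]
template = ["DET", "ADJ", "NOUN", "DET", "NOUN"]
chromosome = [(2, "VERB"), (1, "NOUN"), (2, "NOUN")]  # position 2 appears twice

tagging = decode(chromosome, template)
print(list(zip(sentence, tagging)))
```

In this sketch the chromosome specifies only positions 1 and 2, so the remaining words keep their template tags. A short chromosome like this one plays the role of a candidate building block: a small group of tags, possibly at non-adjacent positions, whose joint contribution to fitness can be examined.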

Supported by project PR1/03-11588.




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Araujo, L. (2003). Studying the Advantages of a Messy Evolutionary Algorithm for Natural Language Tagging. In: Cantú-Paz, E., et al. Genetic and Evolutionary Computation — GECCO 2003. GECCO 2003. Lecture Notes in Computer Science, vol 2724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45110-2_94


  • DOI: https://doi.org/10.1007/3-540-45110-2_94

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40603-7

  • Online ISBN: 978-3-540-45110-5

  • eBook Packages: Springer Book Archive
