On the Usage of Morphological Tags for Grammar Induction

  • Conference paper
MICAI 2007: Advances in Artificial Intelligence (MICAI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4827)

Abstract

We present a study of the effect of adding morphological tags to the training corpus of a grammar inductor. To this end, we carried out several experiments using the Alignment-Based Learning (ABL) grammar induction system, with the syntactically tagged Spanish corpus CAST-3LB for training and testing. ABL produces a set of possible constituents through a word alignment process. We developed an algorithm that converts the hypotheses generated by ABL into ordered production rules and then groups them into possible phrase groups (constituents). These phrase groups correspond to the syntactic tagging of the unannotated text. We compared the phrase groups obtained by our algorithm with the manually tagged groups of CAST-3LB. The grammar induction experiments consisted of trying three variants of the training corpus: (1) using words; (2) using only the morphological tags; and (3) adding morphological tags to words. Our experiments show that including morphological tags in the grammar induction process significantly improves the performance of ABL.
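The three training-corpus variants named in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function name build_variants, the underscore separator used to attach a tag to a word, and the EAGLES-style tags in the example sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A sentence as (word, morphological tag) pairs. The tags used below are
# illustrative placeholders, not values taken from the paper.
TaggedSentence = List[Tuple[str, str]]


def build_variants(sentence: TaggedSentence, sep: str = "_") -> Dict[str, str]:
    """Render one tagged sentence in the three corpus variants.

    The separator `sep` is a hypothetical convention for attaching a tag
    to its word; the abstract does not say how the two were combined.
    """
    words = [word for word, _ in sentence]
    tags = [tag for _, tag in sentence]
    return {
        "words": " ".join(words),                       # variant (1)
        "tags": " ".join(tags),                         # variant (2)
        "words+tags": " ".join(                         # variant (3)
            f"{word}{sep}{tag}" for word, tag in sentence
        ),
    }


example = [("el", "da0ms0"), ("niño", "ncms000"), ("come", "vmip3s0")]
variants = build_variants(example)
for name, line in variants.items():
    print(f"{name}: {line}")
```

Each variant is a plain line of whitespace-separated tokens, so the same alignment-based learner could in principle be trained on any of the three without changes to the learner itself.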

Work done under support of the Mexican Government (CONACYT, SNI, PIFI, and SIP-IPN).

Editor information

Alexander Gelbukh, Ángel Fernando Kuri Morales

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Juárez Gambino, O., Calvo, H. (2007). On the Usage of Morphological Tags for Grammar Induction. In: Gelbukh, A., Kuri Morales, Á.F. (eds) MICAI 2007: Advances in Artificial Intelligence. MICAI 2007. Lecture Notes in Computer Science, vol 4827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76631-5_87

  • DOI: https://doi.org/10.1007/978-3-540-76631-5_87

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76630-8

  • Online ISBN: 978-3-540-76631-5

  • eBook Packages: Computer Science, Computer Science (R0)
