Learning PP attachment from corpus statistics

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Abstract

One of the main problems in natural language analysis is the resolution of structural ambiguity, and prepositional phrase (PP) attachment ambiguity is a particularly difficult case. We describe a robust PP disambiguation procedure that learns from a text corpus. The method is based on a loglinear model, a type of statistical model that can account for combinations of multiple categorical features. We describe a series of experiments that compare the loglinear method against other strategies. For the difficult case of three possible attachment sites, the loglinear method predicts PP attachment with significantly higher accuracy than a simpler procedure that uses lexical association strengths. At the same time, on general newswire text, the accuracy of the statistical method remains 10% below the performance of human experts. This suggests a limit on what can be learned automatically from text, and points to the need to combine machine learning with human expertise.
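To make the idea of a loglinear model over categorical features concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes a conditional loglinear (softmax) classifier over three candidate attachment sites, trained by plain gradient ascent. The site labels, the feature names (`prep`, `verb`), the conjunction feature, and the toy examples are all invented for illustration. The paper's actual feature set, estimation method, and data are not given on this page.

```python
import math
from collections import defaultdict

# Three candidate attachment sites (labels are illustrative assumptions).
SITES = ("verb", "noun1", "noun2")

def features(example, site):
    """Indicator features pairing each categorical attribute with a candidate site."""
    feats = [f"bias|{site}"]
    for name, value in sorted(example.items()):
        feats.append(f"{name}={value}|{site}")
    # One conjunction feature, to show how a combination of attributes is modeled.
    feats.append(f"prep={example['prep']}&verb={example['verb']}|{site}")
    return feats

def probabilities(weights, example):
    """P(site | example) is proportional to exp(sum of active feature weights)."""
    scores = [sum(weights[f] for f in features(example, s)) for s in SITES]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(data, epochs=300, rate=0.2, l2=0.01):
    """Batch gradient ascent on the L2-regularized conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for example, gold in data:
            probs = probabilities(weights, example)
            for site, p in zip(SITES, probs):
                target = 1.0 if site == gold else 0.0
                for f in features(example, site):
                    grad[f] += target - p  # observed minus expected count
        for f, g in grad.items():
            weights[f] += rate * (g / len(data) - l2 * weights[f])
    return weights

def predict(weights, example):
    probs = probabilities(weights, example)
    return SITES[probs.index(max(probs))]

# Invented toy data: here the preposition alone determines the attachment site.
TOY_DATA = [
    ({"prep": "in", "verb": "put"}, "verb"),
    ({"prep": "in", "verb": "see"}, "verb"),
    ({"prep": "with", "verb": "put"}, "noun1"),
    ({"prep": "with", "verb": "see"}, "noun1"),
    ({"prep": "of", "verb": "put"}, "noun2"),
    ({"prep": "of", "verb": "see"}, "noun2"),
]

weights = train(TOY_DATA)
print(predict(weights, {"prep": "of", "verb": "see"}))
```

Each weight is updated by the difference between the observed and the model-expected count of its feature, which is the gradient of the log-likelihood for any loglinear model. In a realistic setting the features would be extracted from an annotated corpus rather than written by hand.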

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Franz, A. (1996). Learning PP attachment from corpus statistics. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_47

  • DOI: https://doi.org/10.1007/3-540-60925-3_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60925-4

  • Online ISBN: 978-3-540-49738-7

  • eBook Packages: Springer Book Archive
