Abstract
One of the main problems in natural language analysis is the resolution of structural ambiguity. Prepositional Phrase (PP) attachment ambiguity is a particularly difficult case. We describe a robust PP disambiguation procedure that learns from a text corpus. The method is based on a loglinear model, a type of statistical model that can account for combinations of multiple categorical features. We describe a series of experiments comparing the loglinear method against other strategies. For the difficult case of three possible attachment sites, the loglinear method predicts PP attachment with significantly higher accuracy than a simpler procedure that uses lexical association strengths. At the same time, on general newswire text, the accuracy of the statistical method remains 10% below the performance of human experts. This suggests a limit on what can be learned automatically from text, and points to the need to combine machine learning with human expertise.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Franz, A. (1996). Learning PP attachment from corpus statistics. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_47
DOI: https://doi.org/10.1007/3-540-60925-3_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive