Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Abstract

The quality of learnt natural language grammars can be enhanced by exploiting the linguistic devices that comprise a corpus. This paper considers one such device, namely punctuation. After briefly considering the linguistics of punctuation, a model capturing some of these properties is presented. Following this, a series of experiments learning unification-based natural language grammars, using the Spoken English Corpus as data, demonstrates that even a simple model of punctuation increases the plausibility of learnt grammars over grammars learnt without the use of punctuation.

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Osborne, M. (1996). Can punctuation help learning? In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_62

  • DOI: https://doi.org/10.1007/3-540-60925-3_62

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60925-4

  • Online ISBN: 978-3-540-49738-7

  • eBook Packages: Springer Book Archive
