Discriminative Models of SCFG and STSG

  • Conference paper
Text, Speech and Dialogue (TSD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Abstract

Standard stochastic grammars use generative probabilistic models, in which rewriting probabilities are conditioned on the symbol being rewritten. Among other undesirable behaviors, such grammars tend to penalize longer derivations of the same input, which is a drawback when they are used for analysis rather than generation. In this contribution, we propose a novel non-generative probabilistic model for both Stochastic Context-Free Grammars (SCFGs) and Stochastic Tree-Substitution Grammars (STSGs), in which the probabilities are conditioned on the leaves of the parse tree (i.e. the input symbols) rather than on its root. Both theoretical and experimental improvements obtained with these new models are presented.
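
The length penalty mentioned in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical toy (the grammar, the rule probabilities, the feature weights and the function names are assumptions made for illustration) and is not the model proposed in the paper. It shows that a generative derivation probability is a product of factors no greater than 1, so extra rewriting steps can only lower it, whereas a score normalized over the parses of one given input carries no such built-in bias.

```python
import math

# Hypothetical toy SCFG: for each left-hand side the rule probabilities sum to 1.
RULE_PROB = {
    ("S", ("a",)): 0.5,
    ("S", ("X",)): 0.5,
    ("X", ("a",)): 0.6,
    ("X", ("b",)): 0.4,
}

# Hypothetical unconstrained rule weights for a log-linear score (illustration only).
RULE_WEIGHT = {
    ("S", ("a",)): 0.0,
    ("S", ("X",)): 0.5,
    ("X", ("a",)): 0.5,
}


def generative_prob(derivation):
    """Product of rule probabilities, conditioned on the symbol being rewritten."""
    return math.prod(RULE_PROB[rule] for rule in derivation)


def conditional_probs(derivations):
    """Softmax of summed rule weights over the candidate parses of ONE input."""
    scores = {name: sum(RULE_WEIGHT[r] for r in d) for name, d in derivations.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {name: math.exp(s) / z for name, s in scores.items()}


# Two derivations of the same input string "a".
parses = {
    "short": [("S", ("a",))],                  # S -> a
    "long": [("S", ("X",)), ("X", ("a",))],    # S -> X -> a
}

gen = {name: generative_prob(d) for name, d in parses.items()}
cond = conditional_probs(parses)

print("generative:", gen)
print("conditional on the input:", cond)
```

In this toy, the longer derivation receives the lower generative probability simply because it multiplies in one more factor. The weights in the second scheme are not constrained to be probabilities, so the ranking of the two parses depends only on the weights, not on the number of steps.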

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rozenknop, A., Chappelier, JC., Rajman, M. (2004). Discriminative Models of SCFG and STSG. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_24

  • DOI: https://doi.org/10.1007/978-3-540-30120-2_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23049-6

  • Online ISBN: 978-3-540-30120-2

  • eBook Packages: Springer Book Archive
