Best parse parsing with Earley's and Inside algorithms on probabilistic RTN

Young S. Han; Key-Sun Choi

doi:10.1017/S1351324900000127

Published online by Cambridge University Press: 12 September 2008

Young S. Han: Korea Advanced Institute of Science and Technology, Taejon, Korea
Key-Sun Choi: Korea Advanced Institute of Science and Technology, Taejon, Korea

Abstract

Inside parsing is a best-parse parsing method based on the Inside algorithm, which is often used to estimate the probabilistic parameters of stochastic context-free grammars. It gives a best parse in O(N³G³) time, where N is the input size and G is the grammar size. The Earley algorithm can be made to return best parses with the same complexity in N.
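As a rough illustration of the idea, the sketch below replaces the sum in the Inside recurrence with a max and keeps back-pointers, so the chart yields the most probable parse instead of the total sentence probability. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses a toy grammar in Chomsky normal form rather than a probabilistic RTN, and the names `BINARY`, `LEXICAL`, `best_parse` and `build_tree` are hypothetical. Its running time is cubic in N but does not reproduce the grammar-size bound quoted above.

```python
# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration; the paper works with probabilistic RTNs instead).
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
LEXICAL = {  # (parent, word) -> rule probability
    ("NP", "she"): 0.4,
    ("NP", "fish"): 0.6,
    ("V", "eats"): 1.0,
}


def build_tree(back, i, j, symbol):
    """Follow back-pointers to rebuild the best tree for symbol over words[i:j]."""
    entry = back[(i, j, symbol)]
    if isinstance(entry, str):          # lexical cell: entry is the word itself
        return (symbol, entry)
    k, left, right = entry              # binary cell: split point and children
    return (symbol, build_tree(back, i, k, left), build_tree(back, k, j, right))


def best_parse(words, start="S"):
    """Return (probability, tree) of the most probable parse, or (0.0, None)."""
    n = len(words)
    best = {}   # (i, j, A) -> max probability that A derives words[i:j]
    back = {}   # (i, j, A) -> word, or (split, left symbol, right symbol)

    # Width-1 spans come from lexical rules.
    for i, word in enumerate(words):
        for (parent, w), p in LEXICAL.items():
            if w == word and p > best.get((i, i + 1, parent), 0.0):
                best[(i, i + 1, parent)] = p
                back[(i, i + 1, parent)] = word

    # Wider spans: same recurrence as Inside, with max in place of sum.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p in BINARY.items():
                    score = (p * best.get((i, k, left), 0.0)
                               * best.get((k, j, right), 0.0))
                    if score > best.get((i, j, parent), 0.0):
                        best[(i, j, parent)] = score
                        back[(i, j, parent)] = (k, left, right)

    prob = best.get((0, n, start), 0.0)
    if prob == 0.0:
        return 0.0, None
    return prob, build_tree(back, 0, n, start)
```

Calling `best_parse(["she", "eats", "fish"])` fills every span bottom-up regardless of whether it can contribute to a complete parse, which is the kind of redundancy a top-down filter is meant to remove.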

By way of experiments, we show that Inside parsing can be more efficient than Earley parsing given a sufficiently large grammar and sufficiently short input sentences. For instance, Inside parsing is better for sentences of 16 or fewer words with a grammar containing 429 states. In practice, parsing can be made efficient by employing the two methods selectively.
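Selective use of the two parsers could be sketched as a simple dispatch on sentence length. This is only an assumed arrangement: `make_selective_parser`, `inside_parse` and `earley_parse` are hypothetical names, and the default cutoff of 16 words is the figure reported above for the 429-state grammar, so it would need to be re-measured for any other grammar.

```python
from typing import Callable, Sequence

# A parser here is any callable that takes a tokenized sentence.
Parser = Callable[[Sequence[str]], object]


def make_selective_parser(inside_parse: Parser,
                          earley_parse: Parser,
                          max_inside_len: int = 16) -> Parser:
    """Return a parser that picks a method by sentence length.

    Sentences with at most max_inside_len words go to inside_parse,
    longer ones go to earley_parse. The cutoff is grammar-dependent.
    """
    def parse(words: Sequence[str]) -> object:
        if len(words) <= max_inside_len:
            return inside_parse(words)
        return earley_parse(words)

    return parse
```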

The redundancy of the Inside algorithm can be reduced by top-down filtering using the chart produced by the Earley algorithm, which is useful in training the probabilistic parameters of a grammar. Extensive experiments on the Penn Tree corpus show that the efficiency of Inside computation can be improved by up to 55%.

Type: Articles
Information: Natural Language Engineering, Volume 1, Issue 2, June 1995, pp. 147-161

DOI: https://doi.org/10.1017/S1351324900000127
Copyright: © Cambridge University Press 1995
