Abstract
The purpose of this paper is (1) to provide a theoretical justification for the use of Monte-Carlo sampling for the approximate resolution of NP-hard maximization problems in the framework of weighted parsing, and (2) to show how such sampling techniques can be implemented efficiently with explicit control of the error probability. We provide an algorithm to compute the local sampling probability distribution that guarantees that the global sampling probability indeed corresponds to the targeted theoretical score. The proposed sampling strategy differs significantly from existing methods, and the comparison thereby exposes the bias induced by these methods.
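The relation between local and global sampling probabilities can be illustrated with a minimal sketch. It is not the algorithm of the paper: it uses the standard inside-weight normalization on a hypothetical toy packed forest, and every name in it (`FOREST`, `inside`, `local_probs`, `sample`, `estimate_best`) is an assumption introduced for illustration. Each local choice is drawn with probability proportional to its weight times the inside weights of its children, so that the product of the local probabilities along a derivation equals the weight of that derivation divided by the total weight of the forest.

```python
import random
from collections import Counter
from functools import lru_cache
from itertools import product
from math import prod

# Hypothetical toy packed forest (an assumption, not taken from the paper):
# each node maps to a list of alternatives (weight, children).
FOREST = {
    "S": [(0.5, ("A", "B")), (0.5, ("C",))],
    "A": [(0.2, ()), (0.3, ())],
    "B": [(1.0, ())],
    "C": [(0.1, ())],
}


@lru_cache(maxsize=None)
def inside(node):
    """Total weight of all derivations rooted at `node`."""
    return sum(w * prod(inside(c) for c in kids) for w, kids in FOREST[node])


def local_probs(node):
    """Local sampling distribution over the alternatives of `node`."""
    total = inside(node)
    return [w * prod(inside(c) for c in kids) / total for w, kids in FOREST[node]]


def sample(node, rng):
    """Draw one derivation top-down using the local distributions."""
    probs = local_probs(node)
    i = rng.choices(range(len(probs)), weights=probs)[0]
    _, kids = FOREST[node][i]
    return (node, i, tuple(sample(c, rng) for c in kids))


def weight(deriv):
    """Weight of a derivation: product of the weights of its choices."""
    node, i, kids = deriv
    return FOREST[node][i][0] * prod(weight(k) for k in kids)


def global_prob(deriv):
    """Probability that `sample` returns `deriv`: product of local probabilities."""
    node, i, kids = deriv
    return local_probs(node)[i] * prod(global_prob(k) for k in kids)


def all_derivations(node):
    """Enumerate every derivation rooted at `node` (feasible only for toy forests)."""
    for i, (_, kids) in enumerate(FOREST[node]):
        for combo in product(*(list(all_derivations(c)) for c in kids)):
            yield (node, i, combo)


def estimate_best(root, n_samples, seed=0):
    """Return the most frequently sampled derivation among `n_samples` draws."""
    rng = random.Random(seed)
    counts = Counter(sample(root, rng) for _ in range(n_samples))
    return counts.most_common(1)[0][0]
```

Calling `estimate_best("S", 2000)` returns the derivation sampled most often. The fixed sample size is a simplification of this sketch: it gives no explicit bound on the error probability, which in the paper is controlled explicitly.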
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chappelier, JC., Rajman, M. (2000). Monte-Carlo Sampling for NP-Hard Maximization Problems in the Framework of Weighted Parsing. In: Christodoulakis, D.N. (eds) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_10
DOI: https://doi.org/10.1007/3-540-45154-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67605-8
Online ISBN: 978-3-540-45154-9
eBook Packages: Springer Book Archive