Abstract
Syntactic parsing is one of the central tasks in Natural Language Processing. In this paper, a multilevel syntactic parsing algorithm is proposed, which is a three-level model with innovative combinations of existing mature tools and algorithms. First, coarse-grained syntax trees are generated with general algorithms, such as Cocke-Younger-Kasami (CYK) algorithm based on Probabilistic Context Free Grammar (PCFG). Second, Recursive Restricted Boltzmann Machines (RRBM) are constructed, which aim at extracting feature vector through training syntax trees with deep learning methods. At last, Learning to Rank (LTR) model is trained to get the most satisfactory syntax tree and furthermore turn the parsing problem into a typical retrieval problem. Experiment results show that our method has achieved the state-of-the-art performance on syntactic parsing task.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Mnih, A., Hinton, G.: Three new graphical models for statistical language modelling. In: Proceedings of the 24th International Conference on Machine Learning, pp. 641–648. ACM (2007)
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Hinton, G.E., Osindero, S., The, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
Salakhutdinov, R., Hinton, G.E.: Deep boltzmann machines. In: International Conference on Artificial Intelligence and Statistics, pp. 448–455 (2009)
Krizhevsky, A., Hinton, G.E.: Learning multiple layers of features from tiny images. Computer Science Department, University of Toronto, Technical report 1 (4), 7 (2009)
Krizhevsky, A., Hinton, G.E.: Using very deep autoencoders for content-based image retrieval. In: ESANN. Citeseer (2011)
Mohamed, A., Dahl, G.E., Hinton, G.: Acoustic modeling using deep belief net-works. IEEE Trans. Audio Speech Lang. Process. 20(1), 14–22 (2012)
Hinton, G.E., Deng, L., Yu, D., Dahl, G.E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., et al.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. Sig. Process. Mag. IEEE 29(6), 82–97 (2012)
Salakhutdinov, R., Hinton, G.E.: Semantic hashing. Int. J. Approximate Reasoning 50(7), 969–978 (2009)
Bengio, Y., Ducharme, R., Vincent, P., et al.: A neural probabilistic language model. J. Mach. Learn. Res. 2003(3), 1137–1155 (2003)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representa-tions of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Hinton, G.E.: Learning distributed representations of concepts. In: Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pp. 1–12 (1986)
Collobert, R., Weston, J.: A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th International Conference on Machine Learning, pp. 160–167. ACM (2008)
Huang, E.H., Socher, R., Manning, C.D., et al.: Improving word representations via global con-text and multiple word prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, vol. 1, pp. 873–882. Association for Computational Linguistics (2012)
Socher, R., Bauer, J., Manning, C.D., Ng, A.Y.: Parsing with compositional vector grammars. In: Proceedings of the ACL Conference (2013)
Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Commun. ACM 18(11), 613–620 (1975)
Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
Zhai, C.X.: Statistical language models for information retrieval. Synth. Lect. Hum. Lang. Technol. 1(1), 1–141 (2008)
Xia, F., Liu, T.Y., Wang, J., et al.: Listwise approach to learning to rank: theory and algorithm. In: Proceedings of the 25th International Conference on Machine Learning, pp. 1192–1199. ACM (2008)
Gildea, D., Palmer, M.: The necessity of parsing for predicate argument recognition. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL 2002, pp. 239–246. Association for Computational Linguistics, Stroudsburg (2002)
Klein, D., Manning C.D.: Accurate unlexicalized parsing. In: Proceedings of the 41st Meeting of the Association for Computational Linguistics, pp. 423–430 (2003)
Petrov, S., Barrett, L., Thibaux, R., Klein, D.: Learning accurate, compact, and interpretable tree annotation. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp. 433–440. Association for Computational Linguistics (2006)
Charniak, E.: A maximum-entropy-inspired parser. In: Proceedings of the 1st North American chapter of the Association for Computational Linguistics Conference, pp. 132–139. Association for Computational Linguistics (2000)
Collins, M.: Head-driven statistical models for natural language parsing. Comput. Linguist. 29(4), 589–637 (2003)
Collins, M., Koo, T.: Discriminative reranking for natural language parsing. Comput. Linguist. 31(1), 25–70 (2005)
Younger, D.H.: Recognition and parsing of context-free languages in time n 3. Inform. Control 10(2), 189 (1967)
Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
Abney, S., Flickenger, S., Gdaniec, C., et al.: Procedure for quantitatively comparing the syntac-tic coverage of English grammars. In: Proceedings of the Workshop on Speech and Natural Language, pp. 306–311. Association for Computational Linguistics (1991)
Freund, Y., Iyer, R., Schapire, R.E., et al.: An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 2003(4), 933–969 (2003)
Burges, C.J.C.: From ranknet to lambdarank to lambdamart: an overview. Learning 2010(11), 23–581 (2010)
Xu, J., Li, H.: AdaRank: a boosting algorithm for information retrieval. In: Proceedings of the 30rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 391–398. ACM (2007)
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China under Grant no. 61372171.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, J., Chen, H., Zhou, S., He, B. (2016). Multilevel Syntactic Parsing Based on Recursive Restricted Boltzmann Machines and Learning to Rank. In: Chau, M., Wang, G., Chen, H. (eds) Intelligence and Security Informatics. PAISI 2016. Lecture Notes in Computer Science(), vol 9650. Springer, Cham. https://doi.org/10.1007/978-3-319-31863-9_4
Download citation
DOI: https://doi.org/10.1007/978-3-319-31863-9_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31862-2
Online ISBN: 978-3-319-31863-9
eBook Packages: Computer ScienceComputer Science (R0)