Abstract
The massive volume of social media text and the growing popularity of "small screen" interaction are creating a strong demand for text compression. This paper presents an approach to English sentence compression based on a "Re-read" mechanism and a Bayesian combination model. First, we build an encoder-decoder composed of Long Short-Term Memory (LSTM) networks. In the encoding stage, the semantics of the original sentence are modeled twice: the output of the first encoder serves as global information and is fed, together with the original sentence, into the second encoder, yielding a more comprehensive semantic vector. In the decoding stage, we adopt a simple attention mechanism that focuses on the most relevant semantic information to improve decoding efficiency. A Bayesian combination model then combines explicit prior information with the "Re-read" model to make better use of explicit features in the training data. Experimental results on the Google Newswire sentence compression dataset show that the proposed method substantially improves compression accuracy, reaching an F1 score of 0.80.
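To make the "Re-read" encoding step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PyTorch-style model in which a first LSTM pass reads the sentence and its per-step outputs are concatenated with the original word embeddings as the input to a second LSTM pass. The class name ReReadEncoder, the dimensions, and the choice of per-step (rather than final-state) global information are illustrative assumptions; the abstract does not fix these details.

# Sketch of the "re-read" encoding idea (illustrative, not the paper's code):
# a first LSTM pass summarizes the sentence, and its outputs are fed back,
# concatenated with the original embeddings, into a second LSTM pass.
import torch
import torch.nn as nn

class ReReadEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # First read: a plain LSTM over the word embeddings.
        self.first_read = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Second read: sees the original embeddings plus the first-pass
        # outputs, so each step has access to "global" sentence information.
        self.second_read = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)

    def forward(self, token_ids):
        emb = self.embed(token_ids)                 # (batch, seq, emb_dim)
        first_out, _ = self.first_read(emb)         # (batch, seq, hid_dim)
        # Re-read: concatenate original embeddings with first-pass outputs.
        second_in = torch.cat([emb, first_out], dim=-1)
        second_out, (h_n, c_n) = self.second_read(second_in)
        return second_out, (h_n, c_n)               # richer per-step semantics

# Usage with toy dimensions.
if __name__ == "__main__":
    enc = ReReadEncoder(vocab_size=1000)
    tokens = torch.randint(0, 1000, (2, 12))        # batch of 2 sentences
    states, _ = enc(tokens)
    print(states.shape)                             # torch.Size([2, 12, 256])

A decoder that attends over these second-pass states, together with the Bayesian combination of model output and prior information described above, would complete the compression pipeline; those components are omitted from this sketch.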
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lu, Z., Liu, W., Zhou, Y., Hu, X., Wang, B. (2017). An Effective Approach of Sentence Compression Based on “Re-read” Mechanism and Bayesian Combination Model. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds) Social Media Processing. SMP 2017. Communications in Computer and Information Science, vol 774. Springer, Singapore. https://doi.org/10.1007/978-981-10-6805-8_11
DOI: https://doi.org/10.1007/978-981-10-6805-8_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6804-1
Online ISBN: 978-981-10-6805-8
eBook Packages: Computer Science, Computer Science (R0)