An Effective Approach of Sentence Compression Based on “Re-read” Mechanism and Bayesian Combination Model

  • Conference paper

Social Media Processing (SMP 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 774)


Abstract

As massive volumes of social media text and the increasingly popular “small screen” interaction mode create a strong demand for text compression, this paper presents an approach to English sentence compression based on a “Re-read” mechanism and a Bayesian combination model. First, we build an encoder-decoder composed of Long Short-Term Memory (LSTM) networks. In the encoding stage, the semantics of the original sentence is modeled twice: the output of the first encoder, serving as global information, is fed into the second encoder together with the original sentence, yielding a more comprehensive semantic vector. In the decoding stage, we adopt a simple attention mechanism that focuses on the most relevant semantic information to improve decoding efficiency. A Bayesian combination model then combines explicit prior information with the “Re-read” model, making fuller use of explicit features of the training data. Experimental results on the Google Newswire sentence compression dataset show that the proposed method greatly improves compression accuracy, reaching an F1 score of 0.80.
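
To make the described pipeline concrete, the sketch below shows one plausible reading of it in PyTorch: a first LSTM pass summarizes the sentence into a global vector, a second “re-read” pass consumes each token concatenated with that vector, a decoder with simple dot-product attention labels every source token keep/delete, and a renormalized product fuses the network’s probabilities with an explicit prior. All layer sizes, the deletion-tagging formulation, and the bayes_combine helper are illustrative assumptions, not the authors’ released code.

# A minimal sketch, assuming PyTorch; hyperparameters and the
# keep/delete tagging formulation are illustrative guesses, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReReadCompressor(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # First read: summarize the raw sentence into a global vector.
        self.encoder1 = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Second read: the same tokens, each concatenated with the global vector.
        self.encoder2 = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        # Decoder tags each source token keep/delete (compression by deletion).
        self.decoder = nn.LSTM(hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, 2)  # logits for keep / delete

    def forward(self, tokens):                        # tokens: (B, T)
        emb = self.embed(tokens)                      # (B, T, E)
        _, (h1, _) = self.encoder1(emb)
        glob = h1[-1].unsqueeze(1).expand(-1, emb.size(1), -1)   # (B, T, H)
        states2, (h2, c2) = self.encoder2(torch.cat([emb, glob], dim=-1))
        dec, _ = self.decoder(states2, (h2, c2))
        # Simple dot-product attention over the re-read encoder states.
        attn = F.softmax(torch.bmm(dec, states2.transpose(1, 2)), dim=-1)
        context = torch.bmm(attn, states2)            # (B, T, H)
        return self.out(torch.cat([dec, context], dim=-1))

def bayes_combine(p_model, p_prior):
    # One reading of the "Bayesian combination": a renormalized product of the
    # network's keep/delete probabilities and an explicit prior (e.g. corpus
    # retention statistics for the token's part of speech).
    joint = p_model * p_prior
    return joint / joint.sum(dim=-1, keepdim=True)

Treating the combination as a product of experts over the two distributions is only one interpretation of the abstract’s wording; the paper itself may weight or parameterize the prior differently.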


Notes

  1. http://www.nltk.org/.

  2. http://code.google.com/p/word2vec/.


Author information


Corresponding author

Correspondence to Wenfen Liu.



Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lu, Z., Liu, W., Zhou, Y., Hu, X., Wang, B. (2017). An Effective Approach of Sentence Compression Based on “Re-read” Mechanism and Bayesian Combination Model. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds) Social Media Processing. SMP 2017. Communications in Computer and Information Science, vol 774. Springer, Singapore. https://doi.org/10.1007/978-981-10-6805-8_11


  • DOI: https://doi.org/10.1007/978-981-10-6805-8_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6804-1

  • Online ISBN: 978-981-10-6805-8

  • eBook Packages: Computer Science, Computer Science (R0)
