An Effective Approach of Sentence Compression Based on “Re-read” Mechanism and Bayesian Combination Model

  • Conference paper

Social Media Processing (SMP 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 774)


Abstract

As massive volumes of social media text and the increasingly popular “small screen” interaction mode create a strong demand for text compression, this paper presents an approach to English sentence compression based on a “Re-read” mechanism and a Bayesian combination model. First, we build an encoder-decoder composed of Long Short-Term Memory (LSTM) networks. In the encoding stage, the semantics of the original sentence is modeled twice: the output of the first encoder, serving as global information, is fed into the second encoder together with the original sentence, yielding a more comprehensive semantic vector. In the decoding stage, we adopt a simple attention mechanism that focuses on the most relevant semantic information to improve decoding efficiency. A Bayesian combination model then combines explicit prior information with the “Re-read” model, making fuller use of explicit features of the training data. Experimental results on the Google Newswire sentence compression dataset show that the proposed method greatly improves compression accuracy, reaching an F1 score of 0.80.
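
To make the described pipeline concrete, the sketch below shows one plausible reading of it in PyTorch: a first LSTM pass summarizes the sentence into a global vector, a second “re-read” pass consumes each token concatenated with that vector, a decoder with simple dot-product attention labels every source token keep/delete, and a renormalized product fuses the network’s probabilities with an explicit prior. All layer sizes, the deletion-tagging formulation, and the bayes_combine helper are illustrative assumptions, not the authors’ released code.

# A minimal sketch, assuming PyTorch; hyperparameters and the
# keep/delete tagging formulation are illustrative guesses, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReReadCompressor(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # First read: summarize the raw sentence into a global vector.
        self.encoder1 = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Second read: the same tokens, each concatenated with the global vector.
        self.encoder2 = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        # Decoder tags each source token keep/delete (compression by deletion).
        self.decoder = nn.LSTM(hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, 2)  # logits for keep / delete

    def forward(self, tokens):                        # tokens: (B, T)
        emb = self.embed(tokens)                      # (B, T, E)
        _, (h1, _) = self.encoder1(emb)
        glob = h1[-1].unsqueeze(1).expand(-1, emb.size(1), -1)   # (B, T, H)
        states2, (h2, c2) = self.encoder2(torch.cat([emb, glob], dim=-1))
        dec, _ = self.decoder(states2, (h2, c2))
        # Simple dot-product attention over the re-read encoder states.
        attn = F.softmax(torch.bmm(dec, states2.transpose(1, 2)), dim=-1)
        context = torch.bmm(attn, states2)            # (B, T, H)
        return self.out(torch.cat([dec, context], dim=-1))

def bayes_combine(p_model, p_prior):
    # One reading of the "Bayesian combination": a renormalized product of the
    # network's keep/delete probabilities and an explicit prior (e.g. corpus
    # retention statistics for the token's part of speech).
    joint = p_model * p_prior
    return joint / joint.sum(dim=-1, keepdim=True)

Treating the combination as a product of experts over the two distributions is only one interpretation of the abstract’s wording; the paper itself may weight or parameterize the prior differently.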


Notes

  1. http://www.nltk.org/.

  2. http://code.google.com/p/word2vec/.


Author information


Corresponding author

Correspondence to Wenfen Liu.



Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lu, Z., Liu, W., Zhou, Y., Hu, X., Wang, B. (2017). An Effective Approach of Sentence Compression Based on “Re-read” Mechanism and Bayesian Combination Model. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds) Social Media Processing. SMP 2017. Communications in Computer and Information Science, vol 774. Springer, Singapore. https://doi.org/10.1007/978-981-10-6805-8_11


  • DOI: https://doi.org/10.1007/978-981-10-6805-8_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6804-1

  • Online ISBN: 978-981-10-6805-8

  • eBook Packages: Computer Science, Computer Science (R0)
