Hie-Transformer: A Hierarchical Hybrid Transformer for Abstractive Article Summarization

  • Conference paper
  • Conference: Neural Information Processing (ICONIP 2019)
  • Part of the book series: Lecture Notes in Computer Science, vol. 11955

Abstract

Abstractive summarization methods based on neural network models can generate more human-like and higher-quality summaries than extractive methods. However, these abstractive models face three main problems: an inability to handle long article inputs, out-of-vocabulary (OOV) words, and repetition in the generated summaries. To tackle these problems, we propose a hierarchical hybrid Transformer model for abstractive article summarization. First, the proposed model is based on a hierarchical Transformer with a selective mechanism: the Transformer has outperformed traditional sequence-to-sequence models in many natural language processing (NLP) tasks, and the hierarchical structure allows the model to handle very long article inputs. Second, a pointer-generator mechanism is applied to combine generating novel words with copying words from the article input, which reduces the occurrence of OOV words. Additionally, we use a coverage mechanism to reduce repetition in the summaries. The proposed model is applied to the CNN/Daily Mail summarization task. The evaluation results and analyses demonstrate that our proposed model achieves competitive performance compared with the baselines.
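The abstract describes the pointer-generator and coverage mechanisms only at a high level. The sketch below illustrates how such mechanisms are commonly implemented, following the general formulation of See et al.'s pointer-generator networks rather than this paper's exact model; all function and variable names here are hypothetical and the extended-vocabulary handling of true OOV tokens is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def pointer_generator_distribution(vocab_logits, attention, p_gen, src_ids):
    """Blend the decoder's generation distribution with a copy distribution
    over the source tokens, weighted by the generation probability p_gen.

    vocab_logits: (batch, vocab_size) raw decoder output scores
    attention:    (batch, src_len)    attention weights over source tokens
    p_gen:        (batch, 1)          probability of generating vs. copying
    src_ids:      (batch, src_len)    source token ids within the vocabulary
    """
    p_vocab = F.softmax(vocab_logits, dim=-1)
    final_dist = p_gen * p_vocab
    # Scatter-add the copy probabilities onto their source token ids.
    final_dist = final_dist.scatter_add(1, src_ids, (1.0 - p_gen) * attention)
    return final_dist

def coverage_penalty(attention, coverage):
    """Coverage term: element-wise minimum of the current attention and the
    accumulated coverage vector, summed over source positions. Penalizing it
    discourages attending repeatedly to the same source tokens."""
    return torch.sum(torch.min(attention, coverage), dim=-1)

# Minimal usage example with random tensors (batch=2, src_len=5, vocab=10).
if __name__ == "__main__":
    vocab_logits = torch.randn(2, 10)
    attention = F.softmax(torch.randn(2, 5), dim=-1)
    p_gen = torch.sigmoid(torch.randn(2, 1))
    src_ids = torch.randint(0, 10, (2, 5))
    coverage = torch.zeros(2, 5)
    dist = pointer_generator_distribution(vocab_logits, attention, p_gen, src_ids)
    print(dist.sum(dim=-1))                        # each row sums to ~1.0
    print(coverage_penalty(attention, coverage))   # zeros at the first step
```

In a full model of this kind, the coverage vector would be accumulated across decoding steps by adding each step's attention weights, and the coverage penalty would be added to the training loss.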



Acknowledgements

This research work was funded by the National Natural Science Foundation of China (Grant Nos. 61772337 and U1736207) and the National Key Research and Development Program of China (Nos. 2018YFC0830703 and 2016QY03D0604).

Author information

Correspondence to Gongshen Liu.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, X., Meng, K., Liu, G. (2019). Hie-Transformer: A Hierarchical Hybrid Transformer for Abstractive Article Summarization. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Lecture Notes in Computer Science, vol 11955. Springer, Cham. https://doi.org/10.1007/978-3-030-36718-3_21

  • DOI: https://doi.org/10.1007/978-3-030-36718-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36717-6

  • Online ISBN: 978-3-030-36718-3

  • eBook Packages: Computer Science (R0)
