Abstract
Abstractive summarization methods based on neural network models can generate more human-like and higher-quality summaries than extractive methods. However, these abstractive models face three main problems: an inability to handle long article inputs, out-of-vocabulary (OOV) words, and repeated words in the generated summaries. To tackle these problems, we propose a hierarchical hybrid Transformer model for abstractive article summarization. First, the proposed model is based on a hierarchical Transformer with a selective mechanism: the Transformer has outperformed traditional sequence-to-sequence models on many natural language processing (NLP) tasks, and the hierarchical structure can handle very long article inputs. Second, the pointer-generator mechanism combines generating novel words with copying words from the article input, which reduces the number of OOV words in the output. Additionally, we use the coverage mechanism to reduce repetition in the summaries. We apply the proposed model to the CNN/Daily Mail summarization task. The evaluation results and analyses demonstrate that our model performs competitively with the baselines.
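The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the two auxiliary mechanisms it names, in the formulation popularized by pointer-generator networks (See et al., 2017): the final word distribution mixes a vocabulary softmax with the copy distribution induced by attention, and the coverage penalty measures overlap between the current attention and the attention accumulated over earlier decoding steps. All function names, shapes, and values below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def pointer_generator_dist(p_vocab, attn, src_ids, p_gen, ext_vocab_size):
    """Final word distribution that mixes generation and copying.

    p_vocab:        (V,) softmax over the fixed output vocabulary
    attn:           (T,) attention weights over the T source tokens
    src_ids:        (T,) ids of source tokens in an extended vocabulary,
                    where in-article OOV words get ids >= V
    p_gen:          scalar in (0, 1), probability of generating vs. copying
    ext_vocab_size: V plus the number of distinct in-article OOV words
    """
    final = np.zeros(ext_vocab_size)
    final[: len(p_vocab)] = p_gen * p_vocab
    # Copy mode: route the remaining attention mass to the ids of the
    # source tokens, so OOV words from the article become producible.
    np.add.at(final, src_ids, (1.0 - p_gen) * attn)
    return final

def coverage_penalty(attn, coverage):
    """Per-step coverage loss: overlap between the current attention and
    the attention accumulated from earlier steps. Minimizing it
    discourages re-attending to (and thus repeating) the same source spans.
    """
    return np.minimum(attn, coverage).sum()

# Toy example: 4 source tokens, fixed vocabulary of size 6,
# and the token at source position 2 is OOV (extended id 6).
p_vocab = np.array([0.1, 0.4, 0.2, 0.1, 0.1, 0.1])
attn = np.array([0.5, 0.2, 0.2, 0.1])
src_ids = np.array([1, 3, 6, 2])
dist = pointer_generator_dist(p_vocab, attn, src_ids, p_gen=0.7, ext_vocab_size=7)
assert abs(dist.sum() - 1.0) < 1e-9      # still a valid probability distribution
coverage = np.zeros(4)                    # nothing covered before the first step
print(coverage_penalty(attn, coverage))   # 0.0 at the first decoding step
```

After each decoding step the coverage vector would be updated as `coverage += attn`, so the penalty grows whenever the decoder keeps attending to the same source positions.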
Acknowledgements
This research work was funded by the National Natural Science Foundation of China (Grant Nos. 61772337 and U1736207) and the National Key Research and Development Program of China (Nos. 2018YFC0830703 and 2016QY03D0604).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Meng, K., Liu, G. (2019). Hie-Transformer: A Hierarchical Hybrid Transformer for Abstractive Article Summarization. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Lecture Notes in Computer Science, vol. 11955. Springer, Cham. https://doi.org/10.1007/978-3-030-36718-3_21
DOI: https://doi.org/10.1007/978-3-030-36718-3_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36717-6
Online ISBN: 978-3-030-36718-3
eBook Packages: Computer Science (R0)