Abstract
Abstractive summarization based on the sequence-to-sequence (seq2seq) model is a popular research topic today, and pre-trained word embeddings are a common unsupervised way to improve the performance of deep learning models in NLP. However, when we apply this method directly to the seq2seq model, we find that it does not achieve the same gains as in other fields because of an over-training problem. In this paper, we propose a normalized encoder-decoder structure to address this problem; it prevents the semantic structure of the pre-trained word embeddings from being destroyed during training. Moreover, we use a novel focal loss function that helps the model focus on examples with low scores to obtain better performance. We conduct experiments on NLPCC 2018 Shared Task 3: single document summarization. The results show that these two mechanisms are extremely useful, helping our model achieve state-of-the-art ROUGE scores and take first place in the current rankings of this task.
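The abstract does not give the exact formulation used in the paper, but as a rough sketch of the second mechanism, the snippet below shows a token-level focal loss of the standard form FL(p_t) = -(1 - p_t)^γ log p_t applied to decoder outputs. The PyTorch framing, the function name focal_loss, and the default γ = 2 are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, ignore_index=-100):
    """Token-level focal loss: -(1 - p_t)^gamma * log(p_t).

    logits:  (batch * seq_len, vocab_size) decoder output scores
    targets: (batch * seq_len,) gold token ids
    gamma:   focusing parameter; gamma = 0 recovers cross entropy
    """
    log_probs = F.log_softmax(logits, dim=-1)                  # log-probabilities over the vocabulary
    nll = F.nll_loss(log_probs, targets, reduction="none",
                     ignore_index=ignore_index)                # -log p_t for each token
    p_t = torch.exp(-nll)                                      # probability assigned to the gold token
    loss = ((1.0 - p_t) ** gamma) * nll                        # down-weight easy (high p_t) tokens
    mask = (targets != ignore_index).float()                   # ignore padding positions
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```

With gamma set to 0 this reduces to ordinary token-level cross entropy, which makes it straightforward to compare against a baseline seq2seq training loss.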
Acknowledgments
This research is supported by the National Key Research and Development Program of China (No. 2016YFB1001103).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Shi, Y., Meng, J., Wang, J., Lin, H., Li, Y. (2018). A Normalized Encoder-Decoder Model for Abstractive Summarization Using Focal Loss. In: Zhang, M., Ng, V., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2018. Lecture Notes in Computer Science, vol. 11109. Springer, Cham. https://doi.org/10.1007/978-3-319-99501-4_34
DOI: https://doi.org/10.1007/978-3-319-99501-4_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99500-7
Online ISBN: 978-3-319-99501-4
eBook Packages: Computer Science, Computer Science (R0)