Abstract
Keyphrase Generation (KG) aims to produce a set of keyphrases that capture the topical information of a given document, and is an important task in Natural Language Processing (NLP). Recently, the Transformer architecture, built on fully-connected self-attention blocks, has been widely adopted across NLP tasks for its parallelism and global context modeling. In KG, however, Transformer-based models rarely outperform recurrent models, and our own observations confirm this phenomenon. Based on these observations, we state the Information Sparsity Hypothesis to explain why Transformer-based models perform poorly on KG. In this paper, we conduct extensive experiments to verify this hypothesis and to search for an effective Transformer model for keyphrase generation. Comprehensive experiments on multiple KG benchmarks show that: (1) in KG, uninformative content abounds in documents while salient information is diluted globally; (2) the vanilla Transformer, equipped with a fully-connected self-attention mechanism, may overlook local context, leading to performance degradation; (3) adding constraints to the self-attention mechanism and introducing direction information improves the vanilla Transformer, achieving state-of-the-art performance on KG benchmarks.
Y. Xu and Y. Luo contributed equally.
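The two modifications named in the abstract, constraining self-attention to a local neighborhood and making it direction-aware, can be illustrated with a short sketch. The following is a minimal PyTorch illustration written for this summary, not the authors' released implementation: the window size, the forward/backward masking scheme, and the concatenation of the two directional contexts are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of locally constrained,
# direction-aware self-attention for a single head.
import torch
import torch.nn.functional as F

def directional_local_attention(q, k, v, window_size=8):
    """q, k, v: (seq_len, d). Returns (seq_len, 2*d): the concatenation of a
    backward-looking and a forward-looking locally windowed attention pass."""
    seq_len, d = q.shape
    scores = q @ k.t() / d ** 0.5                 # (seq_len, seq_len)

    i = torch.arange(seq_len).unsqueeze(1)        # query positions, column
    j = torch.arange(seq_len).unsqueeze(0)        # key positions, row
    local = (j - i).abs() <= window_size          # local-window constraint

    past = local & (j <= i)                       # attend to self and the past
    future = local & (j >= i)                     # attend to self and the future

    def attend(mask):
        # Disallowed positions get -inf so softmax assigns them zero weight;
        # the diagonal is always allowed, so no row is entirely masked.
        return F.softmax(scores.masked_fill(~mask, float('-inf')), dim=-1) @ v

    # Keeping the two directional contexts separate preserves direction
    # information that a single fully-connected softmax would wash out.
    return torch.cat([attend(past), attend(future)], dim=-1)

# Toy usage: 16 positions, 32-dimensional representations.
x = torch.randn(16, 32)
out = directional_local_attention(x, x, x, window_size=4)
print(out.shape)  # torch.Size([16, 64])
```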
Acknowledgments
This work was supported by the National Key Research and Development Program of China (No. 2020AAA0106700) and the National Natural Science Foundation of China (No. 62022027).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xu, Y. et al. (2021). Searching Effective Transformer for Seq2Seq Keyphrase Generation. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science, vol. 13029. Springer, Cham. https://doi.org/10.1007/978-3-030-88483-3_7
DOI: https://doi.org/10.1007/978-3-030-88483-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88482-6
Online ISBN: 978-3-030-88483-3