Searching Effective Transformer for Seq2Seq Keyphrase Generation

Conference paper in Natural Language Processing and Chinese Computing (NLPCC 2021).

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13029)

Abstract

Keyphrase Generation (KG) aims to generate a set of keyphrases that represent the topic information of a given document, a worthwhile task in Natural Language Processing (NLP). Recently, the Transformer architecture, with its fully-connected self-attention blocks, has been widely used in many NLP tasks owing to its parallelism and global context modeling. In KG tasks, however, Transformer-based models can hardly beat recurrent-based models, and our observations confirm this phenomenon. Based on these observations, we state the Information Sparsity Hypothesis to explain why Transformer-based models perform poorly on KG tasks. In this paper, we conduct exhaustive experiments to verify our hypothesis and to search for an effective Transformer model for keyphrase generation. Comprehensive experiments on multiple KG benchmarks show that: (1) in KG tasks, uninformative content abounds in documents while salient information is diluted globally; (2) the vanilla Transformer, equipped with a fully-connected self-attention mechanism, may overlook the local context, leading to performance degradation; (3) adding constraints to the self-attention mechanism and introducing direction information improve the vanilla Transformer model, achieving state-of-the-art performance on KG benchmarks.
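Findings (2) and (3) can be made concrete with a small sketch. The function below is not the authors' implementation; it is a minimal illustration of a single self-attention step that (a) constrains attention to a local window and (b) injects direction information as a distinct bias for left versus right context. The window size and bias values are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def local_directional_attention(q, k, v, window=4, left_bias=0.1, right_bias=-0.1):
    """Scaled dot-product attention restricted to a local window, with a
    simple left/right bias standing in for learned directional embeddings.

    q, k, v: arrays of shape (seq_len, d_model)
    window:  number of positions each token may attend to on either side
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                     # (seq_len, seq_len)

    # Signed relative distance between query position i and key position j.
    idx = np.arange(seq_len)
    dist = idx[None, :] - idx[:, None]                # dist[i, j] = j - i

    # Direction information: bias keys to the left differently from keys
    # to the right (illustrative constants, not learned parameters).
    scores = scores + np.where(dist < 0, left_bias,
                               np.where(dist > 0, right_bias, 0.0))

    # Constrained self-attention: mask out keys beyond the local window so
    # salient local context is not diluted by distant, uninformative tokens.
    scores = np.where(np.abs(dist) <= window, scores, -np.inf)

    # Row-wise softmax over the unmasked positions.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Usage: self-attention over a toy sequence of 10 token vectors.
rng = np.random.default_rng(0)
x = rng.standard_normal((10, 16))
out = local_directional_attention(x, x, x, window=2)
print(out.shape)  # (10, 16)
```

In the paper's setting, such constraints would replace the fully-connected attention pattern of the vanilla Transformer encoder; the sketch shows them in isolation for clarity.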

Y. Xu and Y. Luo contributed equally to this work.



Acknowledgments

This work was supported by the National Key Research and Development Program of China (No. 2020AAA0106700) and National Natural Science Foundation of China (No. 62022027).

Author information

Correspondence to Xipeng Qiu.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, Y. et al. (2021). Searching Effective Transformer for Seq2Seq Keyphrase Generation. In: Wang, L., Feng, Y., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2021. Lecture Notes in Computer Science, vol 13029. Springer, Cham. https://doi.org/10.1007/978-3-030-88483-3_7

  • DOI: https://doi.org/10.1007/978-3-030-88483-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88482-6

  • Online ISBN: 978-3-030-88483-3

  • eBook Packages: Computer Science, Computer Science (R0)
