Neural Paraphrase Generation with Multi-domain Corpus

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12891)

Abstract

Automatic paraphrase generation is an important task in natural language processing. However, progress in paraphrase generation has long been hindered by the lack of large monolingual parallel corpora. We can alleviate this data shortage by making effective use of a multi-domain corpus. In this paper, we propose a novel model that exploits information from other source domains (out-of-domain) to benefit a target domain (in-domain). In our method, we maintain a private encoder and a private decoder for each domain, which model domain-specific information. At the same time, we introduce a shared encoder and a shared decoder, used by all domains, which capture only domain-independent information. In addition, we attach a domain discriminator to the shared encoder and train it adversarially to reinforce the shared encoder's ability to capture common features. Experimental results show that our method not only performs well on traditional domain adaptation tasks but also improves performance across all domains jointly. Moreover, we show that the shared layer learned by our model can be treated as an off-the-shelf component and easily adapted to new domains.
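The shared/private design with an adversarially trained domain discriminator can be illustrated with a short sketch. The code below is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the class and parameter names (SharedPrivateParaphraser, num_domains, emb_dim, hid_dim), the choice of GRU encoders, and the use of a single decoder over the concatenated shared and private states are all illustrative; the gradient-reversal layer follows the standard domain-adversarial training recipe.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates gradients on the way back,
    so the shared encoder is trained to fool the domain discriminator."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.neg() * ctx.lambd, None

class SharedPrivateParaphraser(nn.Module):
    def __init__(self, vocab_size, num_domains, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One private encoder per domain: models domain-specific information.
        self.private_encs = nn.ModuleList(
            [nn.GRU(emb_dim, hid_dim, batch_first=True) for _ in range(num_domains)])
        # Shared encoder: should contain only domain-independent information.
        self.shared_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Single decoder over the concatenated states (a simplification of
        # the paper's separate shared and private decoders).
        self.decoder = nn.GRU(emb_dim, 2 * hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, vocab_size)
        # Domain discriminator on the shared representation, reached
        # through the gradient-reversal layer (adversarial objective).
        self.discriminator = nn.Sequential(
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, num_domains))

    def forward(self, src, tgt_in, domain_id, lambd=1.0):
        src_e, tgt_e = self.embed(src), self.embed(tgt_in)
        _, h_priv = self.private_encs[domain_id](src_e)   # (1, B, H)
        _, h_shared = self.shared_enc(src_e)              # (1, B, H)
        # Adversarial branch: try to predict the domain from the shared state.
        rev = GradReverse.apply(h_shared.squeeze(0), lambd)
        domain_logits = self.discriminator(rev)
        # Decode conditioned on both representations.
        h0 = torch.cat([h_shared, h_priv], dim=-1)        # (1, B, 2H)
        dec_out, _ = self.decoder(tgt_e, h0)
        return self.out(dec_out), domain_logits

model = SharedPrivateParaphraser(vocab_size=10000, num_domains=3)
src = torch.randint(0, 10000, (4, 12))   # batch of 4 source sentences
tgt = torch.randint(0, 10000, (4, 12))   # teacher-forced paraphrase inputs
logits, dom_logits = model(src, tgt, domain_id=1)

In training, one would sum a token-level cross-entropy loss on logits with a cross-entropy loss on dom_logits; because of the reversal layer, minimizing the discriminator loss pushes the shared encoder toward domain-invariant features, which is the effect the abstract describes.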



Author information


Corresponding author

Correspondence to Lin Qiao.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Qiao, L., Li, Y., Zhong, C. (2021). Neural Paraphrase Generation with Multi-domain Corpus. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol 12891. Springer, Cham. https://doi.org/10.1007/978-3-030-86362-3_5


  • DOI: https://doi.org/10.1007/978-3-030-86362-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86361-6

  • Online ISBN: 978-3-030-86362-3

  • eBook Packages: Computer Science, Computer Science (R0)
