
Non-parallel text style transfer with domain adaptation and an attention model

Published in: Applied Intelligence

A Correction to this article was published on 10 July 2021


Abstract

Text style transfer aims to convert a specific style in a given sentence to another target style while preserving the style-independent content of the original sentence; the task becomes challenging when only non-parallel text is available. In this paper, we combine domain adaptation learning and an attention model in a new framework for this task. Domain adaptation leverages related information from the source domain to improve the generative model's capacity for reconstructing data. The attention model assigns each generated word an importance weight with respect to the target style, so the generative model can concentrate on generating the words with higher importance weights and thereby accomplish style transfer effectively. We evaluate our framework on the Yelp, Amazon, and Captions corpora. Automatic and human evaluations demonstrate the effectiveness of our framework compared with previous work under non-parallel and limited training data. The code is available at https://github.com/mingxuan007/text-style-transfer-with-adversarial-network-and-domain-adaptation.
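The core idea of the attention mechanism described above can be illustrated with a minimal sketch: raw style-relevance scores for each word are normalized into attention weights, and the words whose weights exceed a uniform baseline are the style-bearing words the generator would focus on rewriting. The sentence, the scores, and the thresholding rule below are illustrative assumptions, not the paper's trained model.

```python
import math

def softmax(scores):
    """Normalize raw style-relevance scores into attention weights summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical style-relevance scores for each word of a Yelp review,
# standing in for the output of a style classifier's attention layer.
sentence = ["the", "food", "was", "terrible", "and", "cold"]
scores = [0.1, 0.3, 0.1, 2.5, 0.1, 1.8]

weights = softmax(scores)

# Words whose attention weight exceeds the uniform baseline (1/len) carry
# the sentiment style; a generator would concentrate on rewriting these.
style_words = [w for w, a in zip(sentence, weights) if a > 1.0 / len(sentence)]
print(style_words)  # → ['terrible', 'cold']
```

In the full framework, such weights are learned jointly with the generator rather than fixed; the sketch only shows how importance weights separate style-bearing words from style-independent content.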





Acknowledgements

The authors thank the editors and the reviewers for their constructive suggestions. This paper also owes much to prior work on deep learning and the code released alongside it.

This work was supported by the Science and Technology Plan of Yunnan Province of China under grant 2014AB016.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Min He.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: there were errors in some of the equations and in the data in Table 3.


About this article


Cite this article

Hu, M., He, M. Non-parallel text style transfer with domain adaptation and an attention model. Appl Intell 51, 4609–4622 (2021). https://doi.org/10.1007/s10489-020-02077-5

