DOI: 10.1145/3308558.3313734

BoFGAN: Towards A New Structure of Backward-or-Forward Generative Adversarial Nets

Published: 13 May 2019

Abstract

Natural Language Generation (NLG), an important part of Natural Language Processing (NLP), has begun to take full advantage of recent advances in language models. Built on recurrent neural networks (RNNs), NLG has made ground-breaking improvements and is widely applied in many tasks. RNNs typically learn a joint probability over words, and additional information is usually fed into the RNN's hidden layer via implicit vector representations. Still, some problems remain unsolved: a standard RNN is not applicable when we need to impose hard constraints on a language generation task; for example, it cannot guarantee that designated word(s) appear in the generated sentence. In this paper, we propose a Backward-or-Forward Generative Adversarial Nets model (BoFGAN) to address this problem. Starting from a particular given word, at every time step the generative model generates a new preceding or subsequent word conditioned on the sequence generated so far, until both sides reach an end. To train the generator, we first model it as a stochastic policy using Reinforcement Learning; we then employ a discriminator to evaluate the quality of a complete sequence as the end reward; and lastly, we apply Monte Carlo (MC) search to estimate the long-term return and update the generator via policy gradient. Experimental results demonstrate the effectiveness and rationality of the proposed BoFGAN model.
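To make the decoding scheme concrete, below is a minimal, hypothetical sketch of the backward-or-forward generation loop described in the abstract. It is not the authors' implementation: the learned generator is replaced by a random stand-in (dummy_generator), and all names (bof_generate, BOS, EOS) are invented for illustration. What it demonstrates is that the constrained seed word is placed first and the sentence grows outward in both directions, so the hard constraint is satisfied by construction.

```python
import random

BOS, EOS = "<bos>", "<eos>"

def dummy_generator(tokens, side):
    """Stand-in for the learned generative model: returns the next token
    for the chosen side ('left' or 'right'), conditioned on tokens so far."""
    vocab = ["the", "cat", "sat", "on", "a", "mat"]
    end = BOS if side == "left" else EOS
    return random.choice(vocab + [end])

def bof_generate(seed_word, max_len=12):
    """Grow a sentence outward from the constrained seed word until both
    sides emit their end marker (or a length cap is reached)."""
    tokens = [seed_word]                 # the hard constraint: seed is always present
    open_sides = {"left", "right"}
    while open_sides and len(tokens) < max_len:
        side = random.choice(sorted(open_sides))  # backward-or-forward choice
        tok = dummy_generator(tokens, side)
        if tok in (BOS, EOS):
            open_sides.discard(side)     # this side has finished generating
        elif side == "left":
            tokens.insert(0, tok)        # prepend: backward generation
        else:
            tokens.append(tok)           # append: forward generation
    return " ".join(tokens)

if __name__ == "__main__":
    print(bof_generate("mat"))           # e.g. "the cat sat on a mat"
```

In the actual model, per the abstract, the stand-in generator would be a learned network trained as a stochastic policy: a discriminator scores each complete sequence as the end reward, and Monte Carlo rollouts from partial sequences estimate the long-term return used in the policy-gradient update.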


Cited By

  • (2021) A Transformer-based Framework for Neutralizing and Reversing the Political Polarity of News Articles. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW1), 1-26. DOI: 10.1145/3449139. Online publication date: 13-Apr-2021.


Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN:9781450366748
DOI:10.1145/3308558
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. BoFGAN
  2. constrained generation
  3. reinforcement learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19
WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 3

Reflects downloads up to 28 Feb 2025
