Shared-Private LSTM for Multi-domain Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Shared-private models can significantly improve the performance of cross-domain learning. These methods use a shared encoder for all domains and a private encoder for each domain. One issue is that domain-specific knowledge is learned separately, with no interaction between domains. We tackle this problem with a shared-private LSTM (SP-LSTM), which allows domain-specific parameters to be updated within a three-dimensional recurrent neural network. The advantage of SP-LSTM is that it lets domain-private information interact during encoding, and it is faster than a standard LSTM thanks to its parallel mechanism. Results on text classification across 16 domains indicate that SP-LSTM outperforms the state-of-the-art shared-private architecture.
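To make the shared-private setup described in the abstract concrete, the sketch below is a minimal, hypothetical PyTorch illustration of the generic shared-private baseline (one shared encoder plus one private LSTM per domain) that SP-LSTM improves upon. It is not the authors' SP-LSTM, whose per-domain parameters are organized in a single three-dimensional recurrent state so that private information can interact and be computed in parallel; all class and variable names here are illustrative assumptions.

import torch
import torch.nn as nn

class SharedPrivateClassifier(nn.Module):
    """Generic shared-private text classifier (illustrative only)."""

    def __init__(self, vocab_size, num_domains, num_classes,
                 emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # One encoder shared by all domains.
        self.shared = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # One private encoder per domain; in SP-LSTM these per-domain
        # parameters are instead stacked into a single 3D recurrent state.
        self.private = nn.ModuleList(
            [nn.LSTM(emb_dim, hidden_dim, batch_first=True)
             for _ in range(num_domains)])
        self.classify = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens, domain_id):
        # tokens: (batch, seq_len) token ids, all drawn from one domain.
        x = self.embed(tokens)
        _, (h_shared, _) = self.shared(x)              # (1, batch, hidden)
        _, (h_private, _) = self.private[domain_id](x)
        h = torch.cat([h_shared[-1], h_private[-1]], dim=-1)
        return self.classify(h)

# Hypothetical usage: 16 domains (as in the paper's experiments), binary labels.
model = SharedPrivateClassifier(vocab_size=5000, num_domains=16, num_classes=2)
batch = torch.randint(0, 5000, (8, 20))                # 8 sentences of length 20
logits = model(batch, domain_id=3)
print(logits.shape)                                    # torch.Size([8, 2])

In this baseline the per-domain private encoders never exchange information; the paper's contribution is to let them communicate during encoding while keeping the per-domain computation parallelizable.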



Acknowledgments

The work described in this paper was initiated and partly carried out while Haiming Wu was visiting Westlake University. It was supported by the National Statistical Science Research Project of China under Grant No. 2016LY98, the National Natural Science Foundation of China under Grant No. 61876205, the Science and Technology Department of Guangdong Province in China under Grant Nos. 2016A010101020, 2016A010101021 and 2016A010101022, and the Science and Technology Plan Project of Guangzhou under Grant Nos. 201802010033 and 201804010433.

Author information

Corresponding author

Correspondence to Yun Xue.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, H., Zhang, Y., Jin, X., Xue, Y., Wang, Z. (2019). Shared-Private LSTM for Multi-domain Text Classification. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science (LNAI), vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_10


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science (R0)
