Abstract
Shared-private models can significantly improve the performance of cross-domain learning. These methods use a shared encoder for all domains and a private encoder for each domain. One issue is that domain-specific knowledge is learned separately, without any interaction across domains. We tackle this problem with a shared-private LSTM (SP-LSTM), which allows domain-specific parameters to be updated on a three-dimensional recurrent neural network. The advantage of SP-LSTM is that it allows domain-private information to communicate during the encoding process, and it is faster than a standard LSTM due to its parallel update mechanism. Results on text classification across 16 domains indicate that SP-LSTM outperforms the state-of-the-art shared-private architecture.
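To make the setup concrete, below is a minimal PyTorch sketch of the baseline shared-private architecture the abstract describes: one shared encoder for all domains plus one private encoder per domain, with the two views concatenated for classification. All names and sizes here (SharedPrivateEncoder, hidden_dim, and so on) are illustrative assumptions, not the paper's implementation; in particular, the paper's SP-LSTM additionally lets the per-domain private states exchange information in parallel during encoding, which this baseline sketch deliberately omits.

import torch
import torch.nn as nn

class SharedPrivateEncoder(nn.Module):
    # Hypothetical sketch of a shared-private text classifier:
    # one shared LSTM for all domains, one private LSTM per domain.
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_domains, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.shared = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.private = nn.ModuleList(
            [nn.LSTM(emb_dim, hidden_dim, batch_first=True) for _ in range(num_domains)]
        )
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens, domain_id):
        # tokens: (batch, seq_len) token ids, all drawn from one domain.
        x = self.embed(tokens)
        _, (h_shared, _) = self.shared(x)                # domain-invariant view
        _, (h_private, _) = self.private[domain_id](x)   # domain-specific view
        h = torch.cat([h_shared[-1], h_private[-1]], dim=-1)
        return self.classifier(h)

# Example usage with toy sizes (16 domains, binary sentiment):
model = SharedPrivateEncoder(vocab_size=10000, emb_dim=100,
                             hidden_dim=128, num_domains=16, num_classes=2)
logits = model(torch.randint(0, 10000, (4, 20)), domain_id=3)

Because each private encoder only ever sees its own domain's data, domain-specific knowledge is learned in isolation here; that isolation is exactly the limitation the SP-LSTM's cross-domain communication targets.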
Acknowledgments
The work described in this paper was initiated and partly carried out while Haiming Wu was visiting Westlake University. It was supported by the National Statistical Science Research Project of China under Grant No. 2016LY98, the National Natural Science Foundation of China under Grant No. 61876205, the Science and Technology Department of Guangdong Province under Grant Nos. 2016A010101020, 2016A010101021 and 2016A010101022, and the Science and Technology Plan Project of Guangzhou under Grant Nos. 201802010033 and 201804010433.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, H., Zhang, Y., Jin, X., Xue, Y., Wang, Z. (2019). Shared-Private LSTM for Multi-domain Text Classification. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_10
DOI: https://doi.org/10.1007/978-3-030-32236-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6