Incorporating Sample Filtering into Subject-Based Ensemble Model for Cross-Domain Sentiment Classification

Yang, Liang; Zhang, Shaowu; Lin, Hongfei; Wei, Xianhui

doi:10.1007/978-3-319-25816-4_10


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9427)


Abstract

Cross-domain sentiment classification has recently become popular owing to its potential applications, such as marketing. It seeks to generalize a model trained on a source domain so that it can label samples in a target domain. However, the source and target distributions differ substantially in many cases. To address this issue, we propose a comprehensive model that takes sample filtering and labeling adaptation into account simultaneously, named joint Sample Filtering with Subject-based Ensemble Model (SF-SE). First, we introduce a sentence-level Latent Dirichlet Allocation (LDA) model that incorporates topic and sentiment together (SS-LDA). Under this model, a high-quality training dataset is constructed in an unsupervised way. Second, inspired by the distribution variance of domain-independent and domain-specific features related to the subject of a sentence, we introduce a Subject-based Ensemble model to improve classification performance efficiently. Experimental results show that the proposed model is effective for cross-domain sentiment classification.
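
The two stages named in the abstract (filtering source samples, then combining classifiers) can be illustrated with a minimal sketch. It is not the authors' SF-SE implementation: the abstract does not specify the filtering criterion or the combination rule, so the sketch assumes, purely for illustration, that each sample has a topic distribution, that closeness to the target domain is measured with Jensen-Shannon divergence, and that base classifiers are merged by a weighted average. The function names, the threshold, and the weights are all hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def filter_samples(source_topic_dists, target_topic_dist, threshold):
    """Return indices of source samples whose topic distribution lies
    within `threshold` (an assumed cut-off) of the target distribution."""
    return [
        i
        for i, dist in enumerate(source_topic_dists)
        if js_divergence(dist, target_topic_dist) <= threshold
    ]


def ensemble_score(scores, weights):
    """Weighted average of base-classifier positive-class scores
    (an assumed linear combination rule)."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


if __name__ == "__main__":
    source = [[0.5, 0.5], [1.0, 0.0]]   # toy topic distributions
    target = [0.5, 0.5]
    print(filter_samples(source, target, threshold=0.1))
    print(ensemble_score([0.9, 0.3], weights=[3, 1]))
```

In this toy run the first source sample matches the target distribution exactly and is kept, while the second is too far away and is dropped; the ensemble then leans toward the classifier with the larger weight.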



Acknowledgements

This work is partially supported by grants from the Natural Science Foundation of China (No. 61277370, 61402075), the Natural Science Foundation of Liaoning Province, China (No. 201202031, 2014020003), the State Education Ministry and the Research Fund for the Doctoral Program of Higher Education (No. 20090041110002), and the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China
Liang Yang, Shaowu Zhang, Hongfei Lin & Xianhui Wei

Corresponding author

Correspondence to Hongfei Lin.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Maosong Sun
Tsinghua University, Beijing, China
Zhiyuan Liu
Soochow University, Suzhou, Jiangsu, China
Min Zhang
Tsinghua University, Beijing, China
Yang Liu

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Yang, L., Zhang, S., Lin, H., Wei, X. (2015). Incorporating Sample Filtering into Subject-Based Ensemble Model for Cross-Domain Sentiment Classification. In: Sun, M., Liu, Z., Zhang, M., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL NLP-NABD 2015. Lecture Notes in Computer Science (LNAI), vol 9427. Springer, Cham. https://doi.org/10.1007/978-3-319-25816-4_10

DOI: https://doi.org/10.1007/978-3-319-25816-4_10
Published: 08 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25815-7
Online ISBN: 978-3-319-25816-4
eBook Packages: Computer Science, Computer Science (R0)
