Abstract
Sentiment classification aims to identify the sentiment orientation of an opinionated text and is widely used in applications such as market research and product recommendation. Supervised deep learning approaches are prominent in sentiment classification and have shown strong representation-learning ability; however, they rely on costly human annotation. Massive user-tagged opinionated texts on the Internet, such as tweets tagged with emojis, provide a new source of annotation. Such texts, however, may contain noisy labels, which can introduce ambiguity during training. In this paper, we propose a novel Weakly-supervised Anti-noise Contrastive Learning framework for sentiment classification, which we name WACL. We first adopt a supervised contrastive training strategy during the pre-training phase to fully exploit the contrastive patterns latent in weakly labeled data and learn robust representations. We then design a simple layer-dropping strategy that removes the top layers of the pre-trained model, which are the most susceptible to noisy data. Finally, we add a classification layer on top of the remaining model and fine-tune it with labeled data. The proposed framework learns rich contrastive sentiment patterns in the presence of label noise and is applicable to a variety of deep encoders. Experimental results on the Amazon product review, Twitter, and SST5 datasets demonstrate the superiority of our method.
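The abstract describes a three-step pipeline: contrastive pre-training on weakly labeled data, truncation of the noise-sensitive top layers, and supervised fine-tuning with a fresh classification head. The PyTorch sketch below illustrates that pipeline under stated assumptions: the loss follows the general supervised contrastive formulation of Khosla et al. (2020), the encoder is a Hugging Face BertModel, and the number of dropped layers k and the class count are hypothetical placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F
from transformers import BertModel

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss in the style of Khosla et al. (2020):
    embeddings that share a (weak) label are pulled together, all other
    in-batch embeddings are pushed apart."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature
    n = z.size(0)
    off_diag = ~torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & off_diag
    sim = sim.masked_fill(~off_diag, float("-inf"))  # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    n_pos = pos.sum(1)
    has_pos = n_pos > 0  # keep anchors that have at least one positive
    pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(1)
    return -(pos_log_prob[has_pos] / n_pos[has_pos]).mean()

# Step 1: pre-train the encoder on weakly labeled texts by minimizing
# supcon_loss over, e.g., the [CLS] embeddings (training loop omitted).
encoder = BertModel.from_pretrained("bert-base-uncased")

# Step 2: drop the top-k transformer layers, assumed to be the most
# contaminated by label noise (k = 2 is a placeholder; the paper treats
# the number of removed layers as a design choice).
k = 2
encoder.encoder.layer = encoder.encoder.layer[:-k]
encoder.config.num_hidden_layers -= k

# Step 3: add a fresh classification head on the truncated encoder and
# fine-tune the whole model on clean labeled data (loop omitted).
num_classes = 5  # e.g., SST5
classifier = torch.nn.Linear(encoder.config.hidden_size, num_classes)
```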








Data availability statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Notes
One category has a fixed probability of being labeled as another; see the sketch after these notes.
We use the pre-trained BERT-base model with 12 layers and 768 hidden units: https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip.
Emojis are not used as an input to train the network.
The scores of the two metrics are only slightly different because each class contains an equal number of samples in all test sets. Macro-F1 is the unweighted average of the per-class F1 scores and thus treats each category equally; a toy comparison follows these notes.
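Note 1 describes how synthetic label noise is injected. Below is a minimal sketch of one common instantiation, pairwise noise, in which each class flips to a single fixed partner class with probability equal to the noise rate; the (c + 1) mod C mapping is an assumed choice, not necessarily the paper's.

```python
import numpy as np

def inject_pairwise_noise(labels, noise_rate, num_classes, seed=0):
    """Relabel each sample as a fixed partner class with probability
    `noise_rate`; the (c + 1) % num_classes mapping is an assumption."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    flip = rng.random(noisy.shape[0]) < noise_rate
    noisy[flip] = (noisy[flip] + 1) % num_classes
    return noisy
```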
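Note 4's point, that accuracy and macro-F1 remain close yet not identical on class-balanced test sets, can be checked with a toy example (scikit-learn is an assumed choice of library):

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 1, 1, 2, 2]                        # balanced toy test set
y_pred = [0, 1, 1, 1, 2, 0]
print(accuracy_score(y_true, y_pred))              # 0.667, fraction correct
print(f1_score(y_true, y_pred, average="macro"))   # 0.656, mean per-class F1
```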
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant Nos. 61902316, 62133012, 61936006, 61876144, 61876145, 62073255, 62103314, 61973249, 62001381), the Key Research and Development Program of Shaanxi (Program Nos. 2020ZDLGY04-07, 2021ZDLGY02-06), Innovation Capability Support Program of Shaanxi (Program No. 2021TD-05), and Natural Science Basic Research Program of Shaanxi (Program Nos. 2022JQ-675, 2021JQ-712).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Chen, L., Wang, F., Yang, R. et al. Representation learning from noisy user-tagged data for sentiment classification. Int. J. Mach. Learn. & Cyber. 13, 3727–3742 (2022). https://doi.org/10.1007/s13042-022-01622-7