Data augmentation for aspect-based sentiment analysis

Li, Guangmin; Wang, Hui; Ding, Yi; Zhou, Kangan; Yan, Xiaowei

doi:10.1007/s13042-022-01535-5

Original Article
Published: 18 May 2022

Volume 14, pages 125–133 (2023)
International Journal of Machine Learning and Cybernetics

Guangmin Li¹,² (ORCID: orcid.org/0000-0001-7045-572X),
Hui Wang³,
Yi Ding¹,
Kangan Zhou¹ &
Xiaowei Yan⁴

Abstract

In recent years, deep learning has been widely used in natural language processing (NLP), achieving notable success on a variety of NLP tasks. This success is largely due to its ability to learn feature representations from text data automatically. However, the performance of deep learning in NLP can suffer when no sufficiently large labeled corpus is available for training, which limits the improvement that can be achieved. Data augmentation overcomes this small-data problem by expanding the number of samples for the classes in the training corpus. This paper introduces data augmentation for aspect-based sentiment analysis (ABSA), a classical research topic in NLP that has been applied in various fields. The study aims to improve the classification performance of ABSA through augmentation. Two augmentation strategies are presented, part-of-speech (PoS) wise synonym substitution (PWSS) and dependency relation-based word swap (DRAWS), which augment data using PoS, external domain knowledge, and syntactic dependency. These strategies are evaluated through extensive experiments on four public datasets using three representative deep learning models: the aspect-specific graph convolutional network (ASGCN), content attention-based aspect-based sentiment classification (CABASC), and the long short-term memory (LSTM) network. Compared with the results without data augmentation, the proposed strategies achieve a performance gain of up to 11.49% in Macro-F1, with the lowest gain being 2.9%. The experimental results demonstrate that the proposed data augmentation strategies are useful for training deep learning models on small corpora.
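
The abstract names PoS-wise synonym substitution but does not spell out the procedure, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes the sentence is already PoS-tagged, that the aspect term must stay unchanged, and that synonyms come from a small hand-made table. The names `TOY_SYNONYMS` and `pos_wise_substitute` are hypothetical; a real system would presumably draw synonyms from a lexical resource such as WordNet.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical toy synonym table keyed by (lowercased word, PoS tag).
# Assumption: stands in for an external lexical resource such as WordNet.
TOY_SYNONYMS: Dict[Tuple[str, str], List[str]] = {
    ("great", "ADJ"): ["excellent", "superb"],
    ("slow", "ADJ"): ["sluggish"],
    ("food", "NOUN"): ["cuisine"],
}


def pos_wise_substitute(
    tagged: Sequence[Tuple[str, str]],
    aspect_indices: Set[int],
    allowed_pos: Sequence[str] = ("ADJ",),
) -> List[List[str]]:
    """Return augmented token lists, each differing from the input in one word.

    A token is replaced only if it is outside the aspect term, its PoS tag is
    in `allowed_pos`, and the table lists a synonym for that (word, tag) pair.
    """
    augmented = []
    for i, (word, pos) in enumerate(tagged):
        if i in aspect_indices or pos not in allowed_pos:
            continue
        for synonym in TOY_SYNONYMS.get((word.lower(), pos), []):
            tokens = [w for w, _ in tagged]
            tokens[i] = synonym
            augmented.append(tokens)
    return augmented


# Example review sentence; the aspect term "food" sits at index 1.
tagged_sentence = [
    ("The", "DET"), ("food", "NOUN"), ("was", "AUX"), ("great", "ADJ"),
    ("but", "CCONJ"), ("the", "DET"), ("service", "NOUN"), ("was", "AUX"),
    ("slow", "ADJ"),
]
variants = pos_wise_substitute(tagged_sentence, aspect_indices={1})
for tokens in variants:
    print(" ".join(tokens))
```

Each variant would keep the sentiment label of the original sentence. Restricting substitution to selected PoS tags and protecting the aspect term are design choices of this sketch, intended to keep the aspect and its polarity intact.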

References

1. Liu Q, Zhang H, Zeng Y, Huang Z, Wu Z (2018) Content Attention Model for Aspect Based Sentiment Analysis. In: Proceedings of the 2018 World Wide Web Conference, WWW '18, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, pp 1023–1032. https://doi.org/10.1145/3178876.3186001
2. Zhang C, Li Q, Song D Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks. arXiv:1909.03477
3. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
4. Johnson JM, Khoshgoftaar TM (2019) Survey on deep learning with class imbalance. J Big Data 6(1):1–54
5. López V, Fernández A, García S, Palade V, Herrera F (2013) An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Inform Sci 250:113–141
6. Thabtah F, Hammoud S, Kamalov F, Gonsalves A (2020) Data imbalance in classification: experimental evaluation. Inform Sci 513:429–441
7. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inform Process Syst 25:1097–1105
8. Wang J, Perez L The effectiveness of data augmentation in image classification using deep learning. Convolutional Neural Networks Vis. Recognit 11
9. Singh J, McCann B, Keskar NS, Xiong C, Socher R XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering. arXiv:1905.11471
10. Min J, McCoy RT, Das D, Pitler E, Linzen T Syntactic data augmentation increases robustness to inference heuristics. arXiv:2004.11999
11. Sennrich R, Haddow B, Birch A Improving Neural Machine Translation Models with Monolingual Data. arXiv:1511.06709
12. Fadaee M, Bisazza A, Monz C (2017) Data Augmentation for Low-Resource Neural Machine Translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp 567–573. arXiv:1705.00440. https://doi.org/10.18653/v1/P17-2090
13. Dai X, Adel H An Analysis of Simple Data Augmentation for Named Entity Recognition. arXiv:2010.11683
14. Fellbaum C (2012) WordNet. In: The Encyclopedia of Applied Linguistics. https://doi.org/10.1002/9781405198431.wbeal1285
15. Mueller J, Thyagarajan A (2016) Siamese recurrent architectures for learning sentence similarity. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 30
16. Wei J, Zou K (2019) EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, pp 6381–6387. https://doi.org/10.18653/v1/D19-1670
17. Zhang X, Zhao J, LeCun Y Character-level convolutional networks for text classification. arXiv:1509.01626
18. Coulombe C Text data augmentation made simple by leveraging NLP cloud APIs. arXiv:1812.04718
19. Luque FM Atalaya at TASS 2019: Data augmentation and robust embeddings for sentiment analysis. arXiv:1909.11241
20. Zhang Y, Ge T, Sun X Parallel data augmentation for formality style transfer. arXiv:2005.07522
21. Xie Q, Dai Z, Hovy E, Luong M-T, Le QV Unsupervised data augmentation for consistency training. arXiv:1904.12848
22. Hu Z, Yang Z, Liang X, Salakhutdinov R, Xing EP (2017) Toward controlled generation of text. In: International Conference on Machine Learning, PMLR, pp 1587–1596
23. Anaby-Tavor A, Carmeli B, Goldbraich E, Kantor A, Kour G, Shlomov S, Tepper N, Zwerdling N (2020) Do not have enough data? Deep learning to the rescue! In: Proceedings of the AAAI Conference on Artificial Intelligence, 34:7383–7390
24. Li K, Chen C, Quan X, Ling Q, Song Y Conditional augmentation for aspect term extraction via masked sequence-to-sequence generation. arXiv:2004.14769
25. Kobayashi S Contextual augmentation: Data augmentation by words with paradigmatic relations. arXiv:1805.06201
26. Robinson JJ (1970) Dependency structures and transformational rules. Language 259–285
27. Miao Z, Li Y, Wang X, Tan W-C (2020) Snippext: Semi-supervised opinion mining with augmented data. In: Proceedings of The Web Conference 2020, pp 617–628
28. Jeni LA, Cohn JF, De La Torre F (2013) Facing Imbalanced Data: Recommendations for the Use of Performance Metrics. In: International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII) 2013, pp 245–251. https://doi.org/10.1109/ACII.2013.47
29. Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp 1532–1543
30. Xu C, Wang H, Wu S, Lin Z (2021) Treelstm with tag-aware hypernetwork for sentence representation. Neurocomputing 434:11–20

Acknowledgements

We thank Xiang Dai [13] for the great suggestion. This research was supported by the Natural Science Foundation of Hubei Province of China (Grant No. 2020CFB828), the Hubei Normal University Research Project on Teaching Reform (Grant No. XJ202001), the Teaching Research Project of Hubei Normal University (Grant No. 2019030), the Research Project of Young Teachers in Hubei Normal University (Grant No. HS2020QN029), and the Science and Technology Research Project of Hubei Department of Education (Grant No. D20212503).

Author information

Authors and Affiliations

School of Computer and Information Engineering, Hubei Normal University, Huangshi, Hubei, China
Guangmin Li, Yi Ding & Kangan Zhou
School of Arts and Science, Hubei Normal University, Huangshi, China
Guangmin Li
School of Electronics, Electrical Engineering and Computer Science, Queen’s University Belfast, Belfast, UK
Hui Wang
School of Computer Science, China University of Geosciences, Wuhan, Hubei, China
Xiaowei Yan

Corresponding author

Correspondence to Guangmin Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, G., Wang, H., Ding, Y. et al. Data augmentation for aspect-based sentiment analysis. Int. J. Mach. Learn. & Cyber. 14, 125–133 (2023). https://doi.org/10.1007/s13042-022-01535-5

Received: 30 May 2021
Accepted: 24 February 2022
Published: 18 May 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s13042-022-01535-5
