research-article

Vietnamese Sentiment Analysis: An Overview and Comparative Study of Fine-tuning Pretrained Language Models

Authors:

Duong Ngoc Hao,

Ngan Luu-Thuy Nguyen

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6

Article No.: 166, Pages 1 - 27

https://doi.org/10.1145/3589131

Published: 16 June 2023

Abstract

Sentiment Analysis (SA) is one of the most active research areas in Natural Language Processing (NLP) because of its potential value for business and society. With the development of language representation models, numerous methods have shown promising results by fine-tuning pre-trained language models on downstream NLP tasks. For Vietnamese, several pre-trained language models have been released, both monolingual and multilingual. However, these models differ in architecture, pre-training data, and pre-processing steps; consequently, fine-tuning them can be expected to yield different levels of effectiveness. In addition, to date no study has evaluated the performance of these models on the same datasets for the SA task. This article presents a fine-tuning approach to investigate the performance of different pre-trained language models on the Vietnamese SA task. The experimental results show the superior performance of the monolingual PhoBERT and ViT5 models in comparison with previous studies and set new state-of-the-art results on five benchmark Vietnamese SA datasets. To the best of our knowledge, our study is the first attempt to investigate the performance of fine-tuning Transformer-based models on five datasets with different domains and sizes for the Vietnamese SA task.
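The comparison described in the abstract, scoring several fine-tuned models on the same datasets, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the evaluation metrics, so accuracy and macro-averaged F1 are used as common choices for multi-class sentiment classification, and the model names, labels, and predictions are toy placeholders rather than results from the article.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the labels that occur in gold."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def compare_models(gold, predictions_by_model):
    """Score every model against the same gold labels, best macro-F1 first."""
    rows = [
        (name, accuracy(gold, pred), macro_f1(gold, pred))
        for name, pred in predictions_by_model.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Toy placeholder data (hypothetical, not taken from the article).
gold = ["pos", "neg", "neu", "pos", "neg", "pos"]
predictions = {
    "model_a": ["pos", "neg", "neu", "pos", "neg", "neg"],
    "model_b": ["pos", "pos", "pos", "pos", "neg", "pos"],
}

ranking = compare_models(gold, predictions)
for name, acc, f1 in ranking:
    print(f"{name}: accuracy={acc:.3f} macro_f1={f1:.3f}")
```

Holding the gold labels fixed while only the predictions vary is what makes the scores comparable across models. Macro-averaging weights every class equally, which matters when one sentiment class is much rarer than the others.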

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing Volume 22, Issue 6

June 2023

635 pages

ISSN:2375-4699

EISSN:2375-4702

DOI:10.1145/3604597

Editor:
Imed Zitouni
Google, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 June 2023

Online AM: 04 April 2023

Accepted: 17 March 2023

Revised: 17 May 2022

Received: 22 December 2021

Published in TALLIP Volume 22, Issue 6

Qualifiers

Research-article

Funding Sources

The VNUHCM-University of Information Technology’s Scientific Research Support Fund
