1 Introduction

In today’s digitally interconnected world, understanding public sentiment and emotions expressed in textual data has become increasingly important (Abdullah and Rusli 2021). Within the realm of natural language processing (NLP), sentiment analysis stands as a crucial discipline, serving as a linchpin in decoding these nuanced emotions and opinions (Cambria et al. 2024; Wang et al. 2023a). Its applications extend across diverse fields including marketing strategies, brand management, social media monitoring, video analysis, and more, offering invaluable insights (Cui et al. 2023; Cambria et al. 2013; Wang et al. 2023b; Tan et al. 2023; Stappen et al. 2021). The ability to analyze sentiments across languages is especially crucial as the global landscape becomes more diverse and interconnected (Mihalcea et al. 2012; Araújo et al. 2020; Cambria et al. 2023).

Language diversity is an inherent characteristic of global communication, necessitating sentiment analysis techniques that can transcend linguistic barriers (Abdullah and Rusli 2021; He et al. 2022). The intricacies of language nuances and cultural variations add layers of complexity to sentiment interpretation (Wang et al. 2014a). A sentiment-bearing phrase that conveys positivity in one language might carry a different connotation in another (Xu et al. 2022). Moreover, cultural norms influence how emotions are expressed, demanding adaptable models that can decipher sentiments within different cultural contexts (Wang et al. 2020a; Ortony and Turner 1990). Multilingual sentiment analysis (Abdullah and Rusli 2021; Cui et al. 2023; Mihalcea et al. 2012; Araújo et al. 2020; Lo et al. 2017), the process of assessing emotions and opinions in different languages, addresses this need by enabling a comprehensive understanding of global sentiment trends. With multilingual analysis, we can unveil not only linguistic differences but also cultural nuances and contextual variations in sentiment expressions.

In the realm of multilingual sentiment analysis, the Chinese language holds a significant position due to its extensive usage and the unique challenges it poses (Peng et al. 2017; Yan et al. 2014; Zhang and Zhou 2023; Li et al. 2022). Chinese, with its ideographic writing system, tonal complexities, and rich idiomatic expressions, demands specialized approaches for sentiment analysis (Zhang and Zhou 2023; Li et al. 2022; Xu et al. 2019a). Moreover, given China’s influence in global markets and digital spaces, comprehending sentiment within Chinese text is essential for businesses and researchers alike. While existing literature reviews have diligently surveyed these facets, there remains an uncharted territory awaiting exploration: the thematic trends, evolution of research methodologies and topics within the realm of Chinese sentiment analysis.

This study aims to bridge the existing gap by elucidating the distinct evolution of sentiment analysis research methodologies and themes within the realm of Chinese sentiment expression. It addresses this intriguing gap by presenting a comprehensive survey that delves into the dynamic progression of sentiment analysis research methods and thematic foci. While prior literature reviews have expounded upon sentiment analysis techniques, this study takes a pioneering step by specifically examining the interplay between evolving research trends, methodologies and emergent themes, with a particular emphasis on the analysis of sentiment within Chinese text.

This paper is structured to provide a comprehensive review of Chinese sentiment analysis. Section 1 highlights the growing significance of sentiment analysis in understanding public sentiment expressed in textual data, with specific attention to the sentiment analysis within the Chinese language. Section 2 reviews the existing work and underscores the extensive exploration of sentiment analysis, with a focus on the gap in understanding sentiment within the unique linguistic and cultural framework of Chinese text, pointing out there is a need for a pioneering comprehensive survey that examines the evolution of sentiment analysis research methods and themes specifically within the realm of Chinese text. Section 3 provides the detailed survey and analysis on the trends, methods, and topics in sentiment analysis of Chinese text, outlining the evolution of methodologies in multilingual sentiment analysis, especially with a focus on the Chinese language. Section 4 explores the practical applications, limitations, and future prospects of Chinese sentiment analysis. Lastly, Sect. 5 and Sect. 6 conclude the paper and discuss potential future directions.

2 Background and importance of this survey

2.1 Background

Sentiment analysis, a cornerstone of natural language processing, has undergone extensive exploration across a multitude of languages, cultures, and contexts (Abdullah and Rusli 2021; Cui et al. 2023; Mihalcea et al. 2012; Araújo et al. 2020). A growing body of research has shed light on sentiment analysis techniques, methods, and applications, leading to a deeper understanding of human emotions and opinions as conveyed through textual content (Cambria et al. 2013; Wang et al. 2023b; Tan et al. 2023; Stappen et al. 2021). However, a critical gap in the existing literature lies in the investigation of sentiment analysis within the distinct linguistic and cultural framework of Chinese text.

Research in sentiment analysis has predominantly focused on the English language, resulting in a comprehensive understanding of sentiment expressions in this English linguistic domain (Cui et al. 2023; Wang and Lin 2020; Wang et al. 2020b). Numerous studies have explored lexicon-based approaches, machine learning techniques, and deep learning models for sentiment classification and polarity detection in English text (Cui et al. 2023). These efforts have culminated in the development of robust tools and methodologies for deciphering sentiment cues and gauging emotional tone within English textual content (Raghuvanshi and Patil 2016; Nair et al. 2019; Obiedat et al. 2021; Medhat et al. 2014; Prabha and Srikanth 2019; Ligthart et al. 2021).

In contrast, sentiment analysis within the context of Chinese text introduces a host of unique challenges (Peng et al. 2017; Zhou et al. 2022; Yang et al. 2022c; Sun et al. 2016). The rich morphological structure of Chinese characters, combined with the intricate interplay between ideographic meanings, necessitates tailored approaches to sentiment classification (Sun et al. 2016). Additionally, the tonal nature of Chinese writing and the reliance on context for accurate sentiment interpretation further complicate the sentiment analysis process (Peng et al. 2021). The authors presents a pioneering approach that leverages the modern Chinese pronunciation system, known as pinyin, to enhance Chinese sentiment analysis. By exploring phonetic information extracted from pinyin, including audio clips and pinyin token corpus, the authors developed a novel method to disambiguate intonations and integrate this phonetic information with textual and visual features. Such an approach generated innovative multimodal representations for Chinese words, significantly contributing to the improvement of sentiment analysis in Chinese across various datasets (Peng et al. 2021).

Chinese sentiment expression often relies on implicit cultural references, idiomatic phrases, and subtle context cues (Pan et al. 2019). These intricacies demand a deep understanding of Chinese culture, history, and societal norms to accurately decipher the emotional undercurrents of text. Consequently, sentiment analysis models tailored for Chinese must not only account for linguistic peculiarities but also navigate the intricate web of cultural subtext (Tan and Zhang 2008).

A wealth of research in this domain has been dedicated to exploring sentiment analysis specifically in English text, resulting in a plethora of survey papers that comprehensively summarize and analyze the existing methodologies, datasets, and challenges. However, a noticeable discrepancy arises when considering sentiment analysis in the context of Chinese text. Despite the substantial importance of Chinese as one of the world’s most widely spoken languages, the literature landscape reveals a distinct lack of comprehensive survey papers addressing sentiment analysis in this linguistic context. While numerous studies have investigated sentiment analysis techniques for Chinese text (Xu et al. 2022; Huai and Voorde 2022; Huang et al. 2022), the scarcity of surveys hampers the consolidation of knowledge and the identification of trends and gaps in this area. This glaring disparity underscores the need for further scholarly endeavors to bridge this gap, providing insights that can guide future research and advancements in Chinese text sentiment analysis. By addressing this imbalance, researchers can contribute to a more inclusive understanding of sentiment analysis across diverse languages and cultures. However, a comprehensive investigation into the evolution of research methodologies and thematic trends within this realm is conspicuously absent.

This paper fills this void by presenting a comprehensive survey that specifically probes the evolution of sentiment analysis research trends, methods, and thematic foci within the context of Chinese text.

This research not only builds upon the foundation of the previous survey paper (Cui et al. 2023) but also expands its horizons to unravel the intricate tapestry of sentiment analysis within the realm of Chinese text. By employing innovative techniques such as keyword co-occurrence analysis and community detection algorithms, our study not only pioneers an exploration of sentiment analysis within the realm of Chinese but also provides a roadmap for understanding the nuanced interplay between linguistic, cultural, and technological factors in sentiment expression.

2.2 Importance of Chinese sentiment analysis

Chinese sentiment analysis remains crucial despite the availability of advanced language models like ChatGPT and other large language models (LLMs) for several reasons:

  • Language Nuances: Chinese language and sentiments entail unique cultural and linguistic nuances that may not be fully captured by generalized models (Du et al. 2022). Cultural context, idiomatic expressions, and subtle linguistic nuances specific to Chinese language require specialized models for accurate sentiment analysis (Liao et al. 2022; Bhaskaran et al. 2022; Sun et al. 2022). For instance, idiomatic expressions such as “坐井观天” (literally “sitting in the well and looking at the sky,” which actually means “having a limited view of the world” and carries a negative connotation) and “画蛇添足” (literally “drawing legs on a snake,” which means “to overdo something unnecessary” and also carries a negative connotation) are deeply rooted in Chinese culture and require specialized models for accurate sentiment analysis. If we translate such Chinese idioms directly into English without considering their true meanings, the models may miss these subtleties, leading to incorrect sentiment interpretations.

  • Domain Specificity: Specialized models for Chinese sentiment analysis can cater to industry-specific jargon, domains, and nuances, providing more accurate analysis for various sectors like finance, healthcare, or technology (Du et al. 2022; Guo et al. 2022). For example, in the finance sector, terms like “牛市” (meaning “bull market”) directly refer to a financial market characterized by rising prices and positive investor sentiment, while “熊市” (meaning “bear market”) denotes a market trend where prices are declining and investor sentiment is generally negative. Such domain-specific terms are particular to Chinese financial discourse and require precise understanding for sentiment analysis. Similarly, medical terms used in healthcare, such as “+” (“正号”, which means “positive” in general but in the healthcare domain it would actually mean negative), illustrate how different technical domain jargons in various sectors demand domain-specific models for accurate Chinese sentiment analysis.

  • Data Diversity: Chinese sentiment analysis demands diverse and extensive datasets reflecting various dialects, regions, and cultural sentiments within the Chinese language (Peng et al. 2021; Guo et al. 2022; Álvarez-Carmona et al. 2022; Yang et al. 2020). Specialized models focus on accumulating and understanding this diverse dataset for accurate analysis. For example, a comprehensive Chinese restaurant review dataset was designed for Aspect Category Sentiment Analysis (ACSA) and rating prediction (RP) (Li et al. 2023a). It contains 46,730 genuine reviews, meticulously annotated with sentiment polarities across 18 fine-grained aspect categories and 5-star ratings, making it one of the largest and most detailed Chinese sentiment datasets. While Ciron (Xiang et al. 2020) stands as the first Chinese benchmark for irony detection, featuring 8.7 K microblog posts annotated for irony by native speakers, facilitating the exploration of irony detection in Chinese using machine learning models. Additionally, Chinese EmoBank (Lee et al. 2022) offers resources for dimensional sentiment analysis, with valence-arousal ratings for 5512 words, 2998 multi-word phrases, 2582 single sentences, and 2969 multi-sentence texts, supporting the development of fine-grained sentiment models for the Chinese language.

  • Local Applications: Local applications and platforms in China often require sentiment analysis tailored to the Chinese language (Yang et al. 2020). For instance, the phrase “贼好” (literally “thief good”) is a very positive comment often used in Northern China, while “侬好” (meaning “Hello”) is a greeting in the Shanghai dialect. These regional variations necessitate specialized models that can accurately interpret sentiments across different Chinese dialects and regions. Such applications might demand models that are specifically trained and optimized for local user interactions and content. For example, social media platforms like Weibo and WeChat have unique user interactions and slangs that are specific to the Chinese context. Models trained specifically on content from these platforms can better understand and analyze sentiments expressed by local users, enhancing the relevance and accuracy of the analysis.

  • Performance Enhancement: While ChatGPT and similar LLMs are powerful, dedicated sentiment analysis models for Chinese can be fine-tuned and optimized specifically for sentiment-related tasks, potentially providing better performance and accuracy for sentiment analysis tasks in the Chinese language context. For example, a model trained exclusively on Chinese customer reviews from e-commerce platforms like Taobao and JD.com can provide better performance in identifying customer satisfaction and feedback trends than a generalized model. Therefore, fine-tuning and optimization of models for different platforms and application areas is particularly important.

While generalized language models like ChatGPT offer significant capabilities, dedicated Chinese sentiment analysis models are essential to address language nuances, cultural specificity, domain requirements, data diversity, local applications, and performance enhancement within the Chinese language sentiment analysis domain.

In addition, China’s rise as a global economic leader and its role in initiatives like the Belt and Road highlight its ascent (Tombari 2021; Zou et al. 2022). Positioned to become the world’s largest economy, China’s strategic initiatives, especially the Belt and Road, showcase its influence, shaping global trade, connectivity, and geopolitics significantly.

3 Detailed survey on Chinese sentiment analysis

This section offers an in-depth survey on Chinese sentiment analysis including Data Acquisition of Scientific Publications, Subjects, Methods, Trends, and Hot Topics within the realm of sentiment analysis of Chinese text. It also highlights the progression of methodologies in multilingual sentiment analysis, with a specific emphasis on the Chinese language.

3.1 Data acquisition of scientific publications

In alignment with previous research methodologies (Cui et al. 2023), this study’s data collection involved sourcing research publications from the Web of Science across four primary databases: Conference Proceedings Citation Index—Social Sciences & Humanities (CPCI-SSH), Conference Proceedings Citation Index—Science (CPCI-S), Science Citation Index Expanded (SCI-Expanded), and Social Sciences Citation Index (SSCI). To establish a comprehensive dataset, specific keywords including “sentiment analysis,” “sentiment mining,” and “sentiment classification” were employed. This targeted keyword-based approach facilitated the identification and extraction of pertinent papers that constituted our data pool. After screening, a total of 711 literature pieces relevant to “Chinese” or the “Chinese language” were identified.

The search spanned from January 2002 to December 2022, encompassing publication types such as “article” and “conference paper”. Following the screening process, a total of 711 papers were identified. During this time period, the earliest related research appeared in 2005. The distribution of the publications from 2005 to 2022 is shown in Fig. 1.

Fig. 1
figure 1

Number of publications related to sentiment analysis in Chinese from 2005 to 2022. This data collection in this research involved sourcing research publications from the Web of Science (see details in Sect. 3.1) and the search range was set to include publications from a 20-year period between 2002 and 2022. However, since the earliest relevant literature within this range appears in 2005, the results shown in the figures all start from 2005 rather than 2002

Figure 1 illustrates the number of publications in the realm of Chinese sentiment analysis spanning the years 2005 to 2022. Evidently, the publication count follows an exponential growth trajectory over time. The data demonstrates a notable peak in 2016, followed by a minor decline and a subsequent peak in 2018. There is a slight decrease in 2019, after which the count experiences another exponential surge, culminating in the highest point by 2022.

Keywords were extracted from all the collected publications using the same method as previous research (Cui et al. 2023), in which KeyBERT was used for keyword extraction. Extracted keywords were compared and merged with the original keywords in the publications. Subsequently, merging and screening were conducted, excluding terms like “sentiment analysis,” “sentiment classification,” “Chinese language,” and “Chinese sentiment analysis”, which are used as keywords for publication collection. A total of 4062 keywords were selected with a word frequency of 6172. Upon observation, keywords with a frequency below 3 were mostly devoid of substantive meaning. Therefore, 258 keywords with a frequency greater than 2 were chosen, totaling 2025 in frequency, accounting for roughly one-third of the total frequency count. Keywords with a frequency of 10 or more are listed in the Table 1.

Table 1 Keywords with a frequency of 10 or more

High-frequency keywords often serve as indicators of research focal points. For example, “microblog” is a high-frequency keyword, and it indicates that “microblog” analysis is one important focus for Chinese sentiment analysis. “Deep learning” is also a high-frequency keyword, and it indicates that the “deep learning” method is one important focus in Chinese sentiment analysis. Such discoveries are reasonable, and consistent with previous research (Jia 2022; Wu et al. 2022). We have extracted these prominent keywords to form the foundation for our subsequent analysis.

3.2 Subject analysis

Table 1 showcases the high-frequency keywords which essentially encapsulate the primary research focus within the realm of Chinese sentiment analysis. “Microblog” and “deep learning” secures the top positions, trailed by “social medium,” and “text mining”. These high-frequency keywords encompass a wide range, including the study’s subject, content, and employed techniques and methods. With these keywords as a foundation, we established a keyword co-occurrence network, effectively visualizing the interplay between research methodologies and subjects.

Using the same drawing approach as described in Cui et al. (2023), we utilized Pajek software to create a visualization of the keyword co-occurrence network. Pajek’s Louvain community detection algorithm aids in partitioning the keyword co-occurrence network into sub-communities, each representing different sub-fields within keyword analysis (Cui et al. 2023; Blondel et al. 2008; Leydesdorff et al. 2014; Rotta and Noack 2011). The characteristics of this keyword co-occurrence network are elaborated upon in Fig. 2. The keyword co-occurrence network characteristics within the six sub-communities are described in Table 2. The count of nodes within each community reflects the quantity of keywords represented, while the connections, depicted as links, denote the correlations established among these keywords.

Fig. 2
figure 2

Keyword community network obtained using all the keywords related to sentiment analysis in Chinese text from 2005 to 2022

Table 2 Network-wide attributes of sub-communities on a global scale

Figure 2 shows the keyword community network obtained using all the keywords related to sentiment analysis in Chinese text from 2005 to 2022 (Due to the limited size of the image, not all keywords can be fully displayed here). C1 community centers on keywords related to Chinese sentiment in the context of “microblog” analysis, like “sentiment lexicon”, “sentiment dictionary”, “polarity classification” and so on. C2 and C3 communities delve into techniques for sentiment analysis, with C2 community primarily exploring the application of deep learning in NLP, and C3 community focusing predominantly on machine learning approaches and traditional sentiment analysis methods, such as Naive Bayes method, rule-based methods, etc. C4 and C5 communities revolve around specific themes within sentiment analysis. C4 community encompasses subjects linked to “weibo,” “COVID-19,” and “social medium,” while C5 community encompasses topics concerning “user reviews” and “opinions.” C6 community encompasses keywords pertinent to various intricacies within the sentiment analysis process, encompassing terms like “sentiment score,” “dimensional sentiment,” “Word2vec,” and “word segmentation.” Among them, Dimensional sentiment analysis is an important research topic, with existing studies primarily focusing on various analytical methods. The methods used include position-based character-enhanced word embedding and FastText as word vectors (Du and Zhang 2016), the Community-Based Weighted Graph Model (Wang et al. 2016a), a semi-supervised approach for DSA based on the variational autoencoder model (Wu et al. 2019), a tree-structured regional CNN-LSTM model (Wang et al. 2020c), the BiLSTM model (Cheng et al. 2021), and the BERT model (Deng et al. 2023), etc.

For evaluating the effectiveness of these methods, F1-score (Li et al. 2023a; Xiang et al. 2020; Hercigt et al. 2016; Bu et al. 2021; Zhao et al. 2023), Accuracy (Xiang et al. 2020; Hercigt et al. 2016; Bu et al. 2021), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (r) (Lee et al. 2022; Xie et al. 2021) are commonly used.

An intriguing discovery emerges from the comparison with the original keyword “Twitter” in the survey for sentiment analysis of English text (Cui et al. 2023; Wang and Parth 2016; Chin and Wang 2016): “Twitter” is replaced by “microblog”. This substitution appears both logical and congruent with prior research, given that “microblog” predominantly pertains to the Chinese context, whereas “Twitter” is primarily associated with English discourse. This adjustment maintains consistency with previous scholarly undertakings.

Table 2 presents the quantity of nodes denoting keywords within individual communities, along with the corresponding number of interconnecting links. It is observed that there are 326 connections within C2 (C2–C2 connections) and 102 connections between C2 and C3 (C2–C3 connections). Thus, the number of connections within C2 exceeds the number of connections between C2 and C3. The inter-community linkages highlight robust correlations existing among them. Notably, sub-communities C1, C3, C4, C5, and C6 exhibit significant associations with the C2 community, underscoring a collective emphasis on deep learning methodologies within this research domain. The C2 community has a strong correlation with the C5 community, reflecting the wide application of deep learning methods in the fields of user review and opinion mining. This is consistent with our findings in Cui et al. (2023), indicating that researchers are paying more attention to sentiment analysis technology, especially in deep learning methods. The article will analyze the application of sentiment analysis methods in Chinese texts in the next section.

3.3 Method analysis

Table 3 shows the four main methods of Chinese multilingual sentiment analysis. Categorizing the methods in this way helps provide a high-level understanding of the types of approaches available for sentiment analysis and how they relate to each other. It is worth noting that the boundaries between these categories can sometimes be blurred, as certain methods might incorporate aspects from more than one category.

Table 3 The main categories of the methods, and their characteristics, merits and disadvantages

3.3.1 Lexicon-based approaches

In the context of Chinese sentiment analysis, lexicon-based approaches face unique challenges due to the language’s complex structure. Unlike alphabetic languages, Chinese characters often carry multiple meanings depending on the context, and the language lacks spaces between words, making tokenization and segmentation crucial (Xu et al. 2010; Gu et al. 2022). As shown in Table 3, lexicon-based approaches are rooted in the analysis of individual words and their associated sentiment scores (Liu et al. 2009; Xu et al. 2010; Qiang et al. 2005; Du and Tan 2010; Zhang et al. 2008a; Lu et al. 2010; Gravano 2010). The lexicon-based approaches include using lexicons and rules for sentiment analysis. These methods rely heavily on Chinese sentiment lexicons specifically tailored to capture the nuances of Chinese expressions and leverage sentiment lexicons, which are lists of words with predefined sentiment values, to assign sentiment scores to text (Wan 2011; Wang et al. 2016b). The lexicons often require extensive customization and expansion to handle the diverse and context-dependent meanings of Chinese words, especially in Chinese microblogs where slang and abbreviations are prevalent.

With the help of lexicons, researchers can analyze the part of speech and emotional tendency of words in the sentences. As can be seen from Fig. 2, keywords such as “sentiment lexicon,” “sentiment dictionary,” and “emotional dictionary” in the C1 community have a strong connection with “microblog”. Because of the irregular expressions on the microblog, the basic lexicons cannot contain all the meaning or the part of speech of words (Wu et al. 2016; Jiang et al. 2013). Many Out-Of-Vocabulary (OOV) words are commonly used in Chinese microblogs, so constructing a new lexicon can help identify phrases (n-grams) more accurately and reduce semantic ambiguity. This application is critical in handling the rich emotional expressions found in Chinese social media. Some researchers will expand based on dictionaries built by professional organizations (Xu et al. 2010; Zhang et al. 2018, 2012; Xianghua et al. 2013), while others will build a new dictionary based on the language characteristics of user comments or opinions in specific scenarios (Wu et al. 2016; Yang et al. 2013).

Rule-based approaches, on the other hand, rely on predefined linguistic patterns to determine sentiment in Chinese. These approaches often involve sentence dependency analysis, emotional word recognition, and other feature extraction techniques, which are based on predefined sentiment values or rule templates (Liu et al. 2009, 2012, 2014; Wang et al. 2016c). In Chinese sentiment analysis, the effectiveness of rule-based methods can be particularly strong when the rules account for the language’s unique characteristics, such as its rich use of idioms, homophones, and the absence of word boundaries (Liao et al. 2022; Bhaskaran et al. 2022; Sun et al. 2022). If this method can exhaust the rule template as much as possible, it can achieve good sentiment analysis results. However, the limitations of rule-based methods become apparent when dealing with irregular or large-scale corpora in Chinese, especially given the diversity and complexity of Chinese expressions. The challenge is exacerbated by the frequent use of colloquialisms and new terms on Chinese social media platforms like Chinese Weibo, which are not always captured by traditional rule templates. As shown in Fig. 2, the “rule-based” keyword appears in the C3 community. This reflects the close relationship between rule-based methods and machine learning approaches in Chinese sentiment analysis. To overcome the limitations inherent in rule-based methods, these techniques are often combined with machine learning or deep learning methods to enhance the accuracy and robustness of Chinese sentiment analysis (Wen and Li 2014; Yan et al. 2018).

3.3.2 Traditional machine learning algorithms

As more and more users or consumers express their opinions and experiences about products or services online, a large number of comments or opinion texts are in urgent need of sentiment analysis and mining (Zhang et al. 2011; Wang et al. 2014b). While lexicon-based methods have traditionally been used, they struggle to handle the sheer volume and diversity of online content, especially in the context of big data. Similarly, rule-based methods often fall short in accuracy when applied to large-scale Chinese text corpora, which are characterized by linguistic complexity and the frequent emergence of new terms (Liu et al. 2009; Xu et al. 2010; Qiang et al. 2005).

To address these challenges, traditional machine learning algorithms have been increasingly employed in Chinese sentiment analysis. These algorithms, such as Support Vector Machines, Naive Bayes, Decision Trees, and Random Forest, excel at capturing patterns in data without requiring extensive computational resources (Wang et al. 2014b). By using features derived from Chinese text data, these models can effectively distinguish between different sentiment classes, making them well-suited for processing the vast amounts of textual data generated in the Chinese language (Li and Sun 2007; Zhang et al. 2008b; Zheng and Ye 2009).

Researchers have explored various text features to enhance the accuracy of sentiment analysis. Commonly used features include TF-IDF (Zhang and Zheng 2016; Zheng et al. 2018; Li and Sun 2007), continuous bag-of-words and continuous skip-gram models (Bai and Yu 2016), n-grams (Zhang et al. 2008b; Zheng et al. 2018), Word2Vec (Su et al. 2014; Zhang et al. 2015), DF value (which indicates the frequency of features across reviews) (Zheng et al. 2018), as well as syntactic and semantic features (Yang and Yu 2013). Additionally, features that reflect the unique characteristics of Chinese text, such as those with category-distinguishing ability and sentiment orientation, have been identified for improving model performance (Wang et al. 2007).

In Fig. 2, keywords related to text features (e.g., “feature selection,” “TF-IDF,” “information extraction”) and those related to rules (e.g., “rule-based”) are both prominent in the C3 community. This grouping highlights the integration of algorithmic rules with statistical analysis to enhance Chinese sentiment prediction. However, the focus that remains predominantly on traditional machine learning methods have proven particularly effective for Chinese sentiment analysis (Sun et al. 2022; Yang and Yu 2013; Su et al. 2014; Zhang et al. 2015).

3.3.3 Deep learning models

With the escalating prominence of Web 2.0, microblogging has become integral in our daily lives. Platforms like Chinese Weibo facilitate users to freely share reviews and opinions on diverse subjects, including products and events. Leveraging this public opinion data offers valuable insights into societal perspectives. This resource is particularly beneficial for businesses seeking to comprehend user sentiments toward their products, thereby aiding in the development of enhanced offerings (Li et al. 2016).

While traditional machine learning models have showcased proficiency in Chinese sentiment analysis, their effectiveness dwindles when processing vast volumes of data. This decline in performance often stems from their heavy reliance on feature extraction methods, which struggle with the brevity of Chinese microblog texts and the continuous emergence of new words in the Chinese language (Xu et al. 2019b). The brevity of Chinese microblog texts restricts feature extraction possibilities. Furthermore, the continuous emergence of new words and extensive network data complicate the extraction and training of text features for traditional machine learning approaches (Sun et al. 2016).

Deep learning methods present a compelling solution by bypassing manual feature extraction. These models utilize neural networks to autonomously discern intricate patterns and relationships within text data (Zhao et al. 2023). Their capacity to automatically learn complex patterns alleviates the burden of manual annotation, thereby reducing overall costs associated with sentiment analysis tasks.

As shown in C2 community of Fig. 2, evident is the prevalence of deep learning methods surpassing traditional machine learning approaches in sentiment analysis. This C2 community stands out with its extensive network of nodes and connections, indicating the significance of deep learning in the sentiment analysis domain. Notably, keywords from other sub-communities exhibit strong ties to those within the prominent C2 community. Recent advancements have seen a surge in the adoption of deep learning models for sentiment analysis. Researchers have employed diverse models, including but not limited to:

  • Long short-term memory (LSTM) (Day and Lin 2017),

  • Tree-LSTM (Tai et al. 2015),

  • Bidirectional long short-term memory (Bi-LSTM) (Wang et al. 2016d),

  • Recursive neural deep model (RNDM) (Li et al. 2014),

  • Convolutional neural networks (CNN) (Yanmei and Yuda 2015; Li et al. 2016; Xu et al. 2019b),

  • Deep neural network (DNN) stacked with Restricted Boltzmann machine (RBM) layers (Sun et al. 2016),

  • Bidirectional encoder representations from transformers (BERT) (Wang et al. 2020d),

  • Gated alternate neural network (GANN) (Liu and Shen 2020),

  • Innovative models derived from popular architectures (Peng et al. 2018; Yang et al. 2021; Feng and Cheng 2021).

Notably, the graph convolutional network (GCN) has gained prominence in recent years due to its ability to encode both graph structures and node features (Bruna et al. 2014). Its widespread adoption spans various domains within natural language processing, including sentiment analysis (Zhao et al. 2022; Liu et al. 2022). Categorizing these models underlines the significance of neural networks in capturing contextual information crucial for understanding subtle sentiment nuances.

3.3.4 Hybrid approaches

Hybrid methodologies in Chinese sentiment analysis integrate multiple techniques to enhance accuracy and robustness, addressing the specific linguistic and cultural complexities of the Chinese language. Two primary hybrid approaches have emerged in current research.

One approach combines lexicon-based methods with traditional machine learning or deep learning models, particularly to tackle the challenges posed by Chinese text, such as character ambiguity and the lack of word boundaries. For instance:

  • Dictionary-based hybrid methods: Researchers like Xu et al. extended Chinese sentiment lexicons, integrating them with Naive Bayes (NB) (Wang et al. 2016b), while Wang and Jiang selected Chinese informative words combined with Support Vector Machine (SVM) models to improve sentiment classification accuracy (Jiang et al. 2013; Wang et al. 2007).

  • Leveraging LDA and HowNet: Fu et al. used Latent Dirichlet Allocation (LDA) for topic identification, employing HowNet for sentiment polarity classification (Xianghua et al. 2013). Similarly, Day et al. extracted multi-feature words from Chinese sentiment resources, like NTUSD, HowNet, and iSGoPaSD, combining them with Bi-LSTM models to enhance sentiment prediction (Day and Lin 2017).

  • Emotional Dictionary with Neural Networks: Yang et al. merged Chinese emotional dictionaries with CNN and attention-based Bidirectional Gated Recurrent Unit (BiGRU) models to develop the SLCABG model, which effectively captures the intricacies of Chinese emotional expressions (Yang et al. 2020). Ahmed et al. crafted Chinese domain-specific sentiment dictionaries, integrating them with LSTM models for aspect-level sentiment analysis (Ahmed et al. 2020).

  • Emotion Identification Based on Rules: Yan et al. identified emotional subjects using rule-based syntactic dependencies, integrating this with SVM models for automatic emotion analysis in Chinese social media (Yan et al. 2018).

  • The Ortony-Clore-Collins (OCC) Model Combining Rules and Convolutional Neural Network (CNN): Wu et al. proposed an OCC model and a CNN-based opinion summarization method. They tested their hybrid method using real-world Chinese microblog data and compared the accuracy of manual sentiment annotation with the accuracy achieved using the OCC-based sentiment classification rule library. Their study highlights the potential of the proposed hybrid method for sentiment analysis of Chinese social media data (Wu et al. 2020).

These examples demonstrate that lexicon-based approaches remain vital in Chinese sentiment analysis, especially when combined with machine learning techniques to form hybrid approaches to address the specific challenges of the Chinese language.

The other approach involves merging diverse machine learning models, leading to innovative methods for Chinese sentiment analysis. For instance:

  • BT-CNN-ATT Model: Jia developed the BT-CNN-ATT model, amalgamating BERT, CNN, and attention mechanisms. This model not only extracts global features from Chinese Weibo contexts but also captures local features, such as words, to effectively mine emotional information (Jia 2022).

  • Bi-LSTM with Multi-head Attention (MHAT): Long et al. explored sentiment analysis of Chinese social media texts by integrating Bi-LSTM networks with the Multi-head Attention (MHAT) mechanism. This fusion aims to address the limitations of traditional machine learning approaches (Long et al. 2019).

  • Combining CNN with LSTM or Bi-LSTM: Several studies have merged CNN with LSTM (Zhang et al. 2022) or Bi-LSTM (Gan et al. 2021) models, aiming to enhance the effectiveness of Chinese sentiment analysis.

This categorization framework provides a structured understanding of the various approaches used in sentiment analysis. It emphasizes the distinctive traits of each category, elucidating the differences in their principles and methodologies. The above method classification scheme provides a structured framework for understanding the various methods used in sentiment analysis, highlights the unique characteristics of each category, and helps to clarify how these methods differ in their underlying principles and methodologies.

3.4 Trend analysis based on the number of keywords

Sections 3.2 and 3.3 delve into the distinct facets of Chinese sentiment analysis research: Subjects and methods. The analysis involves tracking annual shifts in keyword frequency, serving as a reflection of the evolutionary trajectory of research methods and topics within this field (Cui et al. 2023). This section leverages the keyword community network (referenced in Fig. 2) together with the analysis achievement from Sects. 3.2 and 3.3 to quantify the annual word frequency across sub-communities. By discerning fluctuations in the number of keywords over time, these sections aim to elucidate the evolving research trends within the realm of Chinese sentiment analysis. Furthermore, the visual representation of the keyword community’s evolution is presented in Fig. 3, serving as a graphical aid to complement and illustrate the changes in research trends over the years.

Fig. 3
figure 3

Evolution diagram of keywords obtained using all the keywords related to sentiment analysis in Chinese text from 2005 to 2022. Communities C1–C6 in figure correspond to different community categories within the keyword co-occurrence network. Figure illustrates the changes in the number of keywords within each community over the years. By examining figure, one can discern whether the research themes represented by each community are gaining attention or gradually declining in popularity, thus revealing trends in research topics and subjects for Chinese sentiment analysis

To elucidate the evolutionary patterns of keywords within each community over the years, we meticulously curated high-frequency keywords representative of each category. The fluctuations in word frequency for these selected keywords across different years were carefully plotted and are vividly presented in Figs. 4 and 5. These figures illustrate the changes in the number of high-frequency keywords within each category, reflecting how certain keywords, such as “deep learning” and “attention mechanism,” have gradually gained attention from researchers over time. Conversely, interest in some keywords, such as “SVM” and “hotel review,” has decreased. Additionally, some keywords experienced a temporary surge in attention due to emerging events, such as “COVID-19.” The variations in keyword counts shown in the figures, combined with textual analysis, highlight the trends in evolving topics and methods in the field of Chinese sentiment analysis.

Fig. 4
figure 4

Keyword evolution diagram of C1, C4, and C5 communities. Figure shows the high-frequency keywords for communities C1, C4, and C5, displaying the frequency counts of these keywords over the years.This visual representation allows for a clear observation of the evolving trends of high-frequency keywords within each community. Figure also reflects shifts in research focus and methodologies, providing deep insights into the dynamic changes within the field by highlighting the rise of emerging trends and the decline of older topics. For instance, it can be observed that sudden events, such as “COVID-19,” triggered a temporary surge in attention to related keywords starting from 2019 in the context of Chinese sentiment analysis

Fig. 5
figure 5

Keyword evolution diagram of C2, C3, and C6 communities. Similarly, figure shows the high-frequency keywords for communities C2, C3, and C6, along with their frequency counts over the years. This visual representation allows for a clear observation of the evolving trends of high-frequency keywords within each community. For instance, the keyword “deep learning” has gradually gained more attention since 2015, whereas it was not a prominent topic before that year

Combining the data shown in Figs. 3, 4 and 5, we can see that the earliest research mainly started with keywords in the C5 community. Keywords in the C5 community mainly involve “opinion mining,” “user review,” “Chinese review,” etc. The number of keywords has continued to grow over the years, reflecting researchers’ focus on mining user opinions and comments. The keywords of the C1 community predominantly revolve around sentiments and emotions related to “microblog,” “sentiment lexicon,” and “sentiment dictionary”. The frequency of keywords has been growing rapidly before 2016, and reached a peak in 2016. It was mainly affected by the “microblog” keyword, and also showed the importance of sentiment-related dictionaries in sentiment analysis. After 2016, the number of keywords exhibited a declining trend, indicating that scholars’ research inclination has shifted towards exploring technology (C2) and specific trending topics (C4). Keywords in the C2 and C3 communities are mainly related to sentiment analysis technology (see Fig. 5). The keywords of the C2 community mainly involve words related to deep learning methods such as “Deep learning”, “Natural language Processing”, “word embedding”, “convolutional neural network”, “LSTM” and so on. Since 2014, the frequency of keywords has increased significantly, highlighting researchers’ high attention on deep learning methods in the field of sentiment analysis.

In addition, with the advancement of technology, the granularity of sentiment analysis has gradually changed from document-level and sentence-level to more fine-grained aspect-level (Peng et al. 2018; Fu et al. 2017; Wu et al. 2012; Zhang et al. 2014; Zhang and Lan 2015; Wang et al. 2017; Chen et al. 2022). Document-level sentiment analysis typically provides a general sentiment score for an entire document, while sentence-level analysis focuses on using individual sentences to determine sentiment. Wu et al. (2012) address the challenges of analyzing sentiment in comparative sentences within Chinese documents, proposing a method that enhances document-level sentiment analysis. Zhang et al. (2014) question the reliability of star ratings in capturing review sentiment and present a framework that improves phrase-level sentiment labeling through unsupervised review-level classification. Zhang and Lan (2015) present an innovative approach to enhancing sentiment analysis at both the word and sentence levels by incorporating sentiment information directly into word embeddings. Fu et al. (2017) enhance sentence-level sentiment analysis by integrating the HowNet lexicon with a Phrase Recursive Autoencoder.

However, as the demand for more precise insights has grown, aspect-level sentiment analysis has emerged as a powerful tool. This approach involves dissecting text into various aspects or features, allowing for a nuanced understanding of sentiment toward specific elements (Peng et al. 2018; Wang et al. 2017; Chen et al. 2022). For instance, in the context of product reviews, aspect-level sentiment analysis can identify and analyze sentiments related to different facets such as the product’s design, functionality, durability, and customer support. Peng et al. (2018) introduce the Aspect Target Sequence Model for Chinese aspect-based sentiment analysis, capturing multi-grained text representations and improving sequential information analysis. Chen et al. (2022) develop a language-agnostic, discrete latent opinion tree model for aspect-based sentiment analysis, offering improved interpretability and robustness in low-resource scenarios. Collectively, these studies underscore the ongoing advancement of sentiment analysis in NLP, particularly within the Chinese language.

Moreover, the advancement of aspect-level sentiment analysis is supported by sophisticated techniques such as dependency parsing, attention mechanisms in neural networks, and transfer learning. These innovations enable systems to better capture contextual relationships and subtle variations in sentiment. As a result, the analysis becomes more accurate and actionable, offering a detailed understanding of customer opinions and enhancing decision-making processes across various domains. Researchers are actively tackling challenges such as complex sentence structures, multi-word aspect targets, and domain-specific nuances. The shift towards finer-grained, aspect-level analysis highlights the growing need for nuanced sentiment understanding in various applications, driving the evolution of sentiment analysis research and enhancing its practical impact.

From Fig. 5, we can see that the word frequency of “aspect-based” keywords reached the highest in 2022. The C3 community mainly includes keywords related to machine learning and text features such as “machine learning,” “SVM,” “NB,” and “feature selection”. The frequency of keywords in the C3 community had been increasing before 2018. However, after 2018, the number of keywords showed a downward trend, reflecting the fact that those researchers paid more attention to the application of deep learning methods in sentiment analysis in the C2 community. The keywords of the C4 community mainly involve social media topics related to “Weibo,” “social medium,” “topic model,” and “COVID-19”. The number of keywords experienced two significant increases between 2015 and 2020, indicating that researchers are increasingly interested in social media research, especially during COVID-19, with special attention paid to users’ emotional expressions on social media during the epidemic (Gu et al. 2022; Wang et al. 2020d; Yang et al. 2022a; Gao et al. 2021). The C6 community contains keywords related to various complexities in the sentiment analysis process. C6 reached its peak in 2016 in terms of the number of keywords. This is attributed to topics such as “multi-dimensional sentiment analysis,” “word segmentation,” “affective lexicon,” “Word2vec,” and “affective computing.”

From the perspective of trend analysis, the sentiment analysis of user opinion mining and comments has always been the focus of research. However, in recent years, research topics have gradually tended to explore hot topics on social media platforms, especially Weibo, China’s largest social media platform, where the most discussed topic is COVID-19. This is similar to the results obtained from a review of sentiment analysis in English literature, indicating that the world has paid attention to sentiment analysis research on social media platforms in recent years, especially on the topic of COVID-19 (Cui et al. 2023). In terms of sentiment analysis technology, the focus of research has gradually shifted from machine learning to deep learning, and hybrid methods that can mutually compensate for the shortcomings of the model have also been receiving attention. The conclusion is also similar to our research on English sentiment analysis. However, we found that in Chinese sentiment analysis research, sentiment lexicon and sentiment dictionary have received greater attention than English sentiment analysis research. In addition, the graph convolutional network is gradually applied to the field of sentiment analysis (Zhao et al. 2022; Liu et al. 2022; Wang et al. 2022; Yang et al. 2022b). It reflects the complexity of Chinese text language sentence patterns, word meanings, etc. Integrating lexicons and knowledge graphs into sentiment analysis research can help improve the accuracy of sentiment analysis.

3.5 Chinese-language datasets and evaluations

Chinese sentiment analysis has made notable strides, fueled by the development of high-quality datasets, various dimensional sentiment analysis methods, and performance evaluations. This subsection provides a concise overview of key Chinese-language datasets, different dimensional sentiment analysis methods supported by these corpora, and their performance evaluations.

3.5.1 Chinese sentiment datasets

As shown in Fig. 2, among the high-frequency keywords, several of them are related to corpora, such as “corpus” in community C1, “corpus Chinese” in community C5, and “labeled corpus” in community C6. This reflects the fact that Chinese sentiment corpora is a key focus for Chinese sentiment analysis. Numerous Chinese sentiment corpora have been established, including the NLPCC 2023 Task 4 dataset (DiaASQ dataset, which includes both Chinese and English languages) (Li et al. 2023a), the SemEval-2016 Task 5 dataset (Hercigt et al. 2016), a large-scale Chinese restaurant review dataset (Bu et al. 2021), the Chinese EmoBank (Lee et al. 2022), and a Chinese three-dimensional corpus with valence-arousal-irony (VAI) ratings (Xie et al. 2021). Different research scenarios, such as sarcasm detection—a challenging task involving non-literal expressions—also require specialized corpora such as the Chinese Irony Corpus for support (Xiang et al. 2020; Tang and Chen 2014). Other corpora, such as HowNet (Xianghua et al. 2013; Fu et al. 2017) and NTUSD (Day and Lin 2017) are Chinese sentiment lexicons used to support lexicon-based or hybrid methods.

Chinese sentiment corpora are foundational to the study of sentiment analysis in Chinese, providing essential datasets for training and evaluating different methods. These corpora encompass various aspects of sentiment analysis, including Aspect-Based Sentiment Analysis (ABSA) (Li et al. 2023a), irony detection (Xiang et al. 2020; Tang and Chen 2014), and fine-grained sentiment analysis (Zhou et al. 2022; Xu et al. 2010; Gu et al. 2022), among others, each addressing different facets of how sentiment is expressed in the Chinese language (Zhao et al. 2023; Xu et al. 2016). For instance, a comprehensive Chinese restaurant review dataset was designed for Aspect Category Sentiment Analysis (ACSA) and rating prediction (RP) (Li et al. 2023a). It contains 46,730 genuine reviews, meticulously annotated with sentiment polarities across 18 fine-grained aspect categories and 5-star ratings, making it one of the largest and most detailed Chinese sentiment datasets. Ciron (Xiang et al. 2020) stands as the first Chinese benchmark for irony detection, featuring 8.7 K microblog posts annotated for irony by native speakers, facilitating the exploration of irony detection in Chinese using machine learning models. Additionally, Chinese EmoBank (Lee et al. 2022) provides resources for dimensional sentiment analysis, with valence-arousal ratings for 5512 words, 2998 multi-word phrases, 2582 single sentences, and 2969 multi-sentence texts, supporting the development of fine-grained sentiment models for the Chinese language. C-STANCE (Zhao et al. 2023) focuses on zero-shot stance detection (ZSSD), presenting a dataset of 48,126 annotated text-target pairs that span various domains and introduce complex target-based and domain-based ZSSD tasks. These Chinese sentiment corpora support different dimensional sentiment analysis in Chinese.

3.5.2 Dimensional sentiment analysis

Chinese sentiment datasets (also known as Chinese sentiment corpora) support different levels of dimensional sentiment analysis in Chinese, which is an advanced approach extending beyond traditional binary or categorical sentiment classification (such as positive, negative, or neutral) (Lee et al. 2022; Xie et al. 2021). Instead of merely classifying text as positive or negative, dimensional sentiment analysis assesses various dimensions, such as emotions (joy, sadness, anger), intensity, and specific aspects of sentiment related to different topics or entities within the text (Lee et al. 2022; Zhao et al. 2023). It also involves fine-grained sentiment analysis, which evaluates not only the presence of positive or negative sentiment but also the degree of sentiment (e.g., strongly positive, slightly positive, strongly negative, and slightly negative) (Zhou et al. 2022; Xu et al. 2010; Gu et al. 2022).

Dimensional sentiment analysis in Chinese has gained increasing attention as researchers aim to capture the nuanced emotions expressed in the Chinese language, which often carries cultural and linguistic subtleties that differ from other languages (Du et al. 2022; Liao et al. 2022; Sun et al. 2022; Guo et al. 2022; Yang et al. 2020; Xiang et al. 2020; Lee et al. 2022; Bu et al. 2021; Zhao et al. 2023). Chinese dimensional sentiment analysis is an important research topic, with existing studies primarily focusing on various analytical methods. These methods include position-based character-enhanced word embedding and FastText as word vectors (Du and Zhang 2016), the Community-Based Weighted Graph Model (Wang et al. 2016a), a semi-supervised approach for DSA based on the variational autoencoder model (Wu et al. 2019), a tree-structured regional CNN-LSTM model (Wang et al. 2020c), and deep learning-based methods (Cheng et al. 2021; Deng et al. 2023).

3.5.3 Performance evaluations

For evaluating the effectiveness of these Chinese sentiment corpora and methods, different evaluation metrics such as F1-score (Li et al. 2023a; Xiang et al. 2020; Hercigt et al. 2016; Bu et al. 2021; Zhao et al. 2023), accuracy (Xiang et al. 2020; Hercigt et al. 2016; Bu et al. 2021), mean absolute error (MAE), and Pearson correlation coefficient (r) (Lee et al. 2022; Xie et al. 2021) are commonly used. A review might be positive overall but have different sentiments toward various aspects like service quality, product features, or pricing (Lee et al. 2022; Xie et al. 2021). For examining different Chinese dimensional sentiment analysis, a sufficient size of Chinese corpus (e.g., Chinese sentiment dataset) is necessary. For instance, a Chinese EmoBank (Lee et al. 2022), and a Chinese three-dimensional corpus with valence-arousal-irony (VAI) ratings (Xie et al. 2021) provide the Chinese copus support for such dimensional sentiment analysis.

Evaluations on these datasets highlight the effectiveness of various models, including different dimensional sentiment analysis methods. In Aspect Category Sentiment Analysis (ACSA) tasks, BERT-based models have consistently outperformed traditional methods on datasets like ASAP and the SemEval-2016 RESTAURANT dataset in terms of accuracy (Hercigt et al. 2016), with further enhancements achieved through joint learning models that combine ACSA and rating prediction (RP) in terms of F1-score (Bu et al. 2021). For zero-shot stance detection (ZSSD) on C-STANCE, transformer-based models such as RoBERTa and XLNet have set new benchmarks, surpassing RNN-based models and underscoring the complexity of claim target prediction in terms of F1-score as well (Zhao et al. 2023). BERT has also demonstrated superior performance in irony detection tasks on the Ciron dataset (Xiang et al. 2020), as well as in predicting valence and arousal ratings across different text granularities in Chinese EmoBank in terms of Pearson correlation coefficient (Lee et al. 2022).

These evaluations underscore the significant advancements in Chinese sentiment analysis, driven by the availability of rich Chinese sentiment analysis datasets, different dimensional sentiment analysis methods, and the applications of such methods including powerful deep learning models.

4 Practical applications and future prospects of Chinese sentiment analysis

In Sect. 3, we analyze the themes, methods and trends of existing Chinese sentiment analysis. The practical applications, limitations, and future prospects of Chinese sentiment analysis hold significant implications for industries, communication, and technology.

Despite extensive exploration within existing studies, no publications have explored the use of ChatGPT for sentiment analysis when we conduct this survey research. While there might be such studies available, our search across prominent databases, including the Conference Proceedings Citation Index—Social Sciences & Humanities (CPCI-SSH), Conference Proceedings Citation Index—Science (CPCI-S), Science Citation Index Expanded (SCI-Expanded), and Social Sciences Citation Index (SSCI), yielded no publications related to ChatGPT in this context.

4.1 ChatGPT for Chinese sentiment analysis

Language models like ChatGPT can be categorized under the broader category of Deep Learning Models. These models, including transformer-based architectures like ChatGPT, belong to the deep learning category because they leverage neural network architectures with multiple layers to understand and generate human-like text. Deep learning models are known for their ability to capture complex patterns and relationships in data, making them highly suitable for tasks like sentiment analysis.

Among advanced language models, ChatGPT, powered by transformer architectures, has emerged as a versatile tool for sentiment analysis. In a recent breakthrough, GPT-4 showcased remarkable “theory of mind” abilities by effectively tackling 95% of false-belief tasks (Brunet-Gouet et al. 2023), This achievement hints at the exciting possibility of GPT-like models evolving to acquire cognitive empathy, potentially revolutionizing their role in understanding human emotions and intentions. ChatGPT is pre-trained on a diverse range of text sources, enabling it to capture linguistic variations, idiomatic expressions, and cultural nuances. Its capacity for transfer learning empowers it to adapt to different languages, including Chinese. ChatGPT’s capabilities in multilingual sentiment analysis include:

  • Contextual “Understanding”: ChatGPT’s ability to “understand” context aids sentiment analysis by capturing the meaning behind sentiment-bearing phrases. In a recent study encompassing affective computing tasks, ChatGPT excelled in sentiment analysis, underscoring its prowess in grasping emotional nuances in text (Amin et al. 2023). Polgan et al. underscored the pivotal role of ChatGPT in decoding sentiment and emotions, especially in the context of making informed decisions within the ever-evolving specific landscape (Polgan et al. 2023). With its proficient text analysis capabilities, ChatGPT serves as a valuable tool, equipping users with actionable insights to navigate and gain a deeper understanding of sentiments. Notably, ChatGPT’s ability to capture linguistic nuances and adapt to the Chinese sentiment analysis domain positions it as a promising asset for future applications in this field.

  • Cross-Lingual Adaptation: The transfer learning approach of ChatGPT enables it to generalize sentiment understanding from one language to another. Previously, Zhang et al. introduced an innovative approach known as hybrid-tuning, aimed at addressing the challenge of catastrophic forgetting (Zhang et al. 2023a). This method seamlessly merges both general and domain-specific knowledge, while also harmonizing the stages of pre-training and fine-tuning. The result is a system that excels in delivering precise and contextually fitting responses within the specific Chinese domain. It is worth noting that such a method holds significant promise for extending its applicability to emerging emotional tasks.

  • Handling Idiomatic Expressions: ChatGPT’s exposure to a wide array of linguistic contexts equips it with the ability to handle idiomatic expressions and nuanced sentiments. For idiom fill-in-the-blank and idiom understanding, Li et al. observed that ChatGPT which provides a straightforward chain-of-thought prompt can enhance performance in intricate reasoning (Li et al. 2023b).

Amin et al. evaluated the ability of ChatGPT on English affective computing problems, and the results showed that ChatGPT is a good generalist model that can achieve good results on a variety of problems without any professional training (Amin et al. 2023). A few studies have also confirmed the feasibility of ChatGPT in the field of Chinese question answering and event extraction (Shao et al. 2023; Zhu et al. 2023). The utilization of LLMs, such as ChatGPT, for sentiment analysis in a Chinese-centric context will be one of the future research directions and applications (Fu et al. 2024).

4.2 Global business insights

For multinational businesses, understanding sentiment in the Chinese language is invaluable. As the world’s most widely spoken language with over 1.4 billion native speakers, Chinese market advantage enables multinational businesses to effectively target Chinese consumers and scale their offerings to a massive population. China’s economic influence calls for precise sentiment analysis in a Chinese-centric context, aiding market strategies and decision-making. In a thorough evaluation using a dataset of 7165 financial questions, Ren et al. found that ChatGPT demonstrated higher levels of professionalism and accuracy compared to human services in Chinese financial conundrums, resulting in increased efficiency, cost savings, and improved customer satisfaction, ultimately enhancing the competitiveness and profitability of financial institutions (Ren et al. 2023). Cross-cultural diplomacy is enhanced through accurate sentiment analysis. Chinese-centric sentiment understanding fosters better cross-border communication, essential for diplomatic relations and cultural exchanges. The Chinese market’s magnitude makes sentiment analysis pivotal for e-commerce. Chinese-centric analysis unveils consumer preferences, leading to tailored marketing strategies that resonate with Chinese consumers.

4.3 Enhanced Chinese language models

Advancements in Chinese-specific pre-trained models will bolster sentiment analysis accuracy, addressing linguistic intricacies and cultural subtleties. Models like ChatGPT can further adapt to Chinese-specific linguistic nuances, catering to the dynamic expressions of sentiment across regions. Wang et al. exhibited that GPT-4’s proficiency is on par with Chinese participants who passed a specific exam, demonstrating its potential for discharge summarization, group learning, and strong verbal fluency in human–computer interactions (Wang et al. 2023c). ChatGPT significantly enhances various capabilities, particularly in Chinese-specific task comprehension, while also addressing issues like hallucinations, legal risks, and ethical concerns. This suggests that ChatGPT has the potential to serve as an essential component of the Chinese-centric sentiment analysis. Besides, creating more labeled data in Chinese is vital for robust sentiment analysis.

Hassani et al. explored the potential impact of ChatGPT on data science, presenting opportunities and challenges, that emphasized ChatGPT’s role in data augmentation, citing an example where ChatGPT was employed to generate synthetic radiology reports, enhancing the training data for a machine learning model in radiology report classification at the University of California, San Diego (Hassani and Silva 2023). This potentiality has spurred the application of ChatGPT as a means of data augmentation in various interdisciplinary fields. Van Nooten et al. utilized ChatGPT 3.5 to create authentic anti-vaccination tweets in Dutch, aiming to balance a skewed vaccine hesitancy classification dataset (Van Nooten and Daelemans 2023). By augmenting the gold standard data with these generated examples, the study demonstrated notable enhancements for underrepresented classes, overall recall improvement, and a slight decline in precision for more prevalent classes, while also assessing the synthetic data’s generalization to human-generated data in the classification task. Collaborative efforts in data augmentation can drive more accurate Chinese-centric models. Future research should prioritize ethical multilingual sentiment analysis, accounting for cultural sensitivity and fairness in sentiment interpretation.

Buscemi and Proverbio evaluate the performance of ChatGPT 3.5, ChatGPT 4, Gemini Pro, and LLaMA2 7b in multilingual sentiment analysis, including Mandarin Chinese (Buscemi and Proverbio 2024). Their findings indicate that while ChatGPT 4 and Gemini Pro generally exhibit a positive bias towards Mandarin, showing improvements over earlier versions, they still face notable inconsistencies and biases, particularly in handling nuanced texts such as irony and sarcasm. LLaMA2, on the other hand, shows minimal variation across languages but tends to be overly optimistic. Comparisons with human responses reveal that native Mandarin speakers provided more positive feedback, somewhat aligning with the models’ outputs. However, significant gaps remain, underscoring the need for further refinement to ensure cultural sensitivity and reliability in sentiment analysis across different languages.

Dong et al. conducted a detailed comparative analysis of various pre-trained language models (PLMs) for Chinese sentiment analysis in human–computer negotiation dialogues (Dong et al. 2024). They examined the advanced models such as GPT-3.5, BERT, Claude, ELECTRA, NEZHA, ERNIE 3.0, BART, and XLNet. The study assesses their effectiveness in sentiment detection using a large, diverse Chinese dataset with sentiment annotations. The findings highlight significant performance differences among the models, offering valuable insights for enhancing automated negotiation systems and improving sentiment analysis in negotiation scenarios.

Zhang et al. explored the effectiveness of three large language models (LLMs)—a baseline GPT-3.5, a Chinese language-specific Erlangshen-RoBERTa-110M-Sentiment, and a financial domain-specific Chinese FinBERT—in extracting sentiment from Chinese financial news to predict stock price movements (Zhang et al. 2023b). Using a standardized procedure, the study analyzes 394,426 news summaries, developing quantitative trading strategies based on the sentiment factors extracted by each model. Results show that the Erlangshen model, despite its smaller size, outperforms the others in annual return, risk-adjusted return, and excess return, highlighting the importance of language-specific pre-training for enhanced performance in sentiment analysis and quantitative trading in the Chinese market.

Ihnaini et al. investigated the application of advanced large language models (LLMs) for sentiment analysis of classical Chinese literature, focusing on Song Dynasty poetry (Song Ci) (Ihnaini et al. 2024). The study acknowledges the complexities of the linguistic structures and unique emotional expressions inherent in Song Ci, which pose significant challenges to traditional sentiment analysis methods. Utilizing fine-tuning techniques, the research navigates and interprets the nuanced language and emotional content of Song Ci with improved accuracy. The evaluation of the fine-tuned Chinese LLaMA 2 and Qwen models highlights their performance in detecting subtle emotional shifts within these poems. Specifically, the ChatGLM-7B (8-bit) model stands out, achieving an F1 Score of 0.840, showcasing its ability to merge ancient literary analysis with modern computational technology.

The fusion of Chinese-centric sentiment analysis and advanced language models has the potential to revolutionize cross-cultural communication and understanding. As we conclude this exploration, we reflect on the trajectory of sentiment analysis in a globalized world and envision a future where Chinese sentiment analysis paves the way for richer, more accurate cross-lingual emotional assessment.

5 Future directions

Based on the results of exploring the prospects of Chinese sentiment analysis discussed in Sects. 34, this section outlines several future directions that warrant further exploration.

5.1 Ethical multilingual sentiment analysis and understanding

Research in multilingual contexts introduces specific ethical challenges related to language dynamics, data handling, and presentation (Abdullah and Rusli 2021; Schembri and Jahić Jašić 2022). It is crucial to address these ethical considerations systematically. In the study by Janusch et al., an examination of interviews with Chinese teachers conducted initially in English and later in Chinese highlighted the impact of language dominance on interpretation. The researchers observed that the perspective brought by researchers from another cultural orientation and a position of power can influence interpretation (Janusch 2011). Switching to Chinese empowered participants, fostering more accurate and expressive communication, resulting in richer, more valid data and robust research outcomes. Holmes et al. recognized the widespread ethical concerns in multilingual research, they advocated for proactive ethical practices, emphasizing the importance of researcher reflexivity (Hassani and Silva 2023). Researchers are encouraged to challenge monolingual state and institutional practices. Therefore, efforts to minimize biases and promote cultural sensitivity in multilingual sentiment analysis will lead to more reliable and universally applicable models. Ethical considerations in model training and deployment are essential for fostering responsible AI development in the future.

5.2 Multilingual emotion recognition

Emotion recognition holds a pivotal role in enhancing human–computer interaction. While numerous studies have historically focused on speech emotion recognition utilizing various classifiers and feature extraction methods, the majority have primarily tackled this challenge within the confines of a single language. Numerous studies take a significant leap by expanding monolingual speech emotion recognition to encompass emotions expressed in multiple languages simultaneously, thereby constituting a comprehensive system (Lee 2019; Heracleous and Yoneyama 2019; Ma et al. 2024). This departure from the norm represents an exploration into uncharted territory, as the domain of multilingual emotion recognition extends beyond the well-established realm of sentiment analysis. Traditionally, emotions such as joy, sadness, anger, and fear have been explored in the context of single-language studies. However, the current research broadens the scope to comprehend and recognize these emotions across diverse languages (Zhang et al. 2024). This pioneering approach has the potential to revolutionize human–computer interactions by fostering a more inclusive and culturally sensitive understanding of emotional expressions. By acknowledging and responding to emotions expressed in various languages, this study strives to enhance the effectiveness and adaptability of computer systems, ultimately contributing to a more nuanced and responsive interaction between humans and machines.

5.3 Interpretable sentiment understanding

Chinese-centric sentiment analysis involves the identification, examination, quantification, and retrieval of implicit emotions and subject-related information. Its impact spans a wide range of domains, including assessing the mental health of individuals and detecting fraud in the financial sector (Jawale and Sawarkar 2020). As the volume of social media data continues to surge, there is an increasing demand for automated sentiment analysis. Deep learning, although offering high accuracy, often operates with an opaque decision-making strategy. To bolster decision-making integrity, trust, belief, fairness, reliability, and impartiality become paramount. It is essential to move beyond mere accuracy and address the interpretability of the models. Developing models that not only deliver accurate results but also offer clear explanations for their decisions is crucial for enhancing transparency and accountability in sentiment analysis. This approach ensures that the decision-making process is not only accurate but also comprehensible, fostering trust and confidence in the results generated by sentiment analysis models.

5.4 Fine-tuning and domain adaptation

Analyzing sentiment in Chinese-centric content is essential for extracting user sentiments related to various events or topics, be it in tweets or on Weibo (Zhao et al. 2015; Wan 2008). This enables a more nuanced understanding of user sentiments, providing a complementary perspective to sentiment analysis in other languages. While there has been considerable advancement in sentiment analysis technology, there is a noticeable gap in research focusing on Chinese-centric sentiments. To address this gap, future work could involve pretraining models on large-scale datasets specifically for object recognition, laying the groundwork for effective transfer learning. Customizing models for distinct domains and languages through fine-tuning and domain adaptation emerges as a crucial strategy to enhance performance (Zhai et al. 2024; Lossio-Ventura et al. 2024). This approach ensures that sentiment analysis models are attuned to the intricacies of the Chinese language and its unique cultural contexts, ultimately leading to more accurate and contextually relevant sentiment predictions. The exploration of such tailored approaches will contribute significantly to advancing the field of sentiment analysis in the Chinese language.

5.5 Multimodal understanding

In addition to Chinese-centric textual content, other modalities, such as images and videos, represent straightforward media through which individuals express emotions on social networking sites (Yang et al. 2024; Miah et al. 2024). Social media users are progressively turning to images and videos to articulate opinions and share experiences. Conducting sentiment analysis on this extensive visual content can significantly improve the extraction of user sentiments related to events or topics, creating a complementary aspect to textual sentiment analysis. While substantial advancements have been achieved in this technology, there remains a dearth of research focusing on multi-modal Chinese-centric sentiments. To address this gap, future research could delve into incorporating visual and contextual cues into sentiment analysis models. Recognizing the prevalence of visual content in expressing emotions, integrating these cues can enhance the models’ overall understanding of sentiment, leading to more nuanced and accurate predictions. Exploring the multi-modal aspects of sentiment analysis in the Chinese context is imperative to fully grasp the diverse ways in which users convey emotions across different mediums, ultimately advancing the capabilities of sentiment analysis technology in the realm of visual content.

6 Conclusion

In conclusion, this comprehensive survey on the progression of sentiment analysis in Chinese text unveils essential insights into the thematic trends, methodologies, and emerging patterns within this domain. The utilization of a unique framework combining keyword co-occurrence analysis and sophisticated community detection algorithms has illuminated the landscape of Chinese sentiment analysis research.

Throughout the past two decades, this study has traced the dynamic interplay between research methodologies and evolving topics, revealing correlations and shedding light on significant hotspots and trends in Chinese language text analysis. The comparative analysis presented here not only highlights correlations but also signifies evolving patterns, providing invaluable insights into the intricate terrain of sentiment analysis within the Chinese language context.

Our investigation of multilingual sentiment analysis highlights the amplification of challenges in sentiment interpretation due to language diversity. The Chinese language, with its intricate characters and cultural nuances, requires specialized techniques for accurate sentiment assessment. ChatGPT has emerged as a transformative technology, showcasing its adaptability to languages, including Chinese, and its ability to conduct sentiment analysis in context.

Moreover, beyond its academic contribution, this paper serves as a practical guide, offering insights into sentiment analysis methodologies and thematic trends. It lays the groundwork for future explorations, delineates technical limitations, and outlines promising directions for the advancement of sentiment analysis in Chinese text.

The roadmap provided by this study is poised to aid researchers, practitioners, and stakeholders navigating the complexities of sentiment analysis in the Chinese language. The identified trends and methodologies serve as a cornerstone for future investigations and advancements in this burgeoning field, fostering continued progress and innovation in sentiment analysis within the context of Chinese text.