A data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews

Zhang, Shunxiang; Xu, Hanqing; Zhu, Guangli; Chen, Xiang; Li, KuanChing

doi:10.1007/s00500-021-06228-9

Application of soft computing
Published: 24 September 2021

Volume 26, pages 853–866, (2022)
Soft Computing

Abstract

New sentiment words in product reviews are a valuable resource because they reflect users' opinions directly. Extracting them supports better information services for users and offers theoretical support for related research in edge computing. Traditional extraction methods generally ignore context and syntactic information, which leads to low accuracy and recall. To address this issue, we propose a data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews. First, the probability that a new word is a sentiment word is calculated from location rules derived from the sequence labeling result, and a candidate set of new sentiment words is obtained according to this probability. Then, the candidate set is supplemented by matching appositive words based on edit distance. Finally, the final set of new sentiment words is obtained through fine-grained filtering, which includes calculating pointwise mutual information (PMI) and the difference coefficient of positive and negative corpus (DC-PNC). The experimental results show that the new sentiment words extracted by the proposed method are effective and can clearly improve the accuracy and recall of sentiment analysis.
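
Two of the building blocks named in the abstract, edit distance and pointwise mutual information, have standard textbook definitions, and the sketch below illustrates those definitions only. It is not the authors' implementation: the function names, the document-level co-occurrence counting, and the toy tokens are assumptions made for illustration, and DC-PNC is omitted because the abstract does not define it.

```python
import math
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pmi(word: str, seed: str, docs: List[List[str]]) -> float:
    """PMI from document-level co-occurrence: log2(p(w, s) / (p(w) * p(s))).

    Returns -inf when the pair never co-occurs (an assumed convention).
    """
    n = len(docs)
    n_w = sum(1 for d in docs if word in d)
    n_s = sum(1 for d in docs if seed in d)
    n_ws = sum(1 for d in docs if word in d and seed in d)
    if n_ws == 0:
        return float("-inf")
    return math.log2((n_ws * n) / (n_w * n_s))


# Hypothetical toy corpus: each review is a list of tokens.
reviews = [["good", "nice"], ["good", "nice"], ["bad"], ["bad", "poor"]]
print(edit_distance("kitten", "sitting"))  # 3
print(pmi("nice", "good", reviews))        # 1.0
```

In a pipeline like the one described, a small edit distance between a candidate and a known sentiment word could flag a spelling variant worth adding to the candidate set, and a high PMI with seed sentiment words could serve as one filtering signal. Any thresholds would need to be tuned on real data.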

Data availability

Data cannot be made available for privacy reasons.

Acknowledgments

This research work was supported in part by the National Natural Science Foundation of China (Grant No. 62076006) and in part by the 2019 Anhui Provincial Natural Science Foundation Project (Grant No. 1908085MF189).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, 232001, People’s Republic of China
Shunxiang Zhang, Hanqing Xu, Guangli Zhu & Xiang Chen
Department of Computer Science and Information Engineering (CSIE), Providence University, Taizhong, 43301, Taiwan
KuanChing Li

Corresponding authors

Correspondence to Shunxiang Zhang or KuanChing Li.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

No humans or any individual participants are involved in this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, S., Xu, H., Zhu, G. et al. A data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews. Soft Comput 26, 853–866 (2022). https://doi.org/10.1007/s00500-021-06228-9

Accepted: 30 August 2021
Published: 24 September 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s00500-021-06228-9
