Clustering Context-Dependent Opinion Target Words in Chinese Product Reviews

Zhang, Yu; Liu, Miao; Xia, Hai-Xia

doi:10.1007/s11390-015-1586-2


Regular Papers
Published: 14 September 2015

Volume 30, pages 1109–1119, (2015)

Journal of Computer Science and Technology

Yu Zhang¹,
Miao Liu¹ &
Hai-Xia Xia¹


Abstract

In opinion mining of product reviews, an important task is to summarize customers’ opinions by opinion target. Because of differing knowledge backgrounds and linguistic habits, customers use a variety of terms to describe the same opinion target. These terms are called context-dependent synonyms. To provide a comprehensive summary, the first step is to group these opinion target words. In this article, we focus on clustering context-dependent opinion target words in Chinese product reviews. We apply three clustering methods based on distributional similarity and experiment with four different co-occurrence matrices. Experimental results on a large number of reviews show that our proposed heuristic k-means clustering method, using the co-occurrence matrix of opinion target words, achieves the best clustering result with lower time complexity and less memory. In addition, its accuracy is more stable across different combinations of initial centroids. For some kinds of co-occurrence matrices, we also find that small (low-dimensional) matrices achieve higher average clustering accuracy than large (high-dimensional) matrices. Our findings provide a time-efficient and space-efficient way to cluster opinion targets with high accuracy.
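As a rough illustration of the general idea, the sketch below clusters target words by the similarity of their co-occurrence vectors using a plain seeded k-means with cosine similarity. It is not the authors' heuristic k-means, whose details are not given in this abstract. The toy English reviews, the seed words, and the function names `context_vectors`, `cosine`, and `kmeans` are all assumptions made for this example.

```python
import math

def context_vectors(reviews, targets):
    """Build one co-occurrence row per target word over the review vocabulary."""
    vocab = sorted({tok for review in reviews for tok in review})
    col = {tok: j for j, tok in enumerate(vocab)}
    vectors = {t: [0.0] * len(vocab) for t in targets}
    for review in reviews:
        for t in set(review):
            if t in vectors:
                for tok in review:
                    if tok != t:  # count every other token in the same review
                        vectors[t][col[tok]] += 1.0
    return vectors

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def kmeans(vectors, seeds, max_iter=20):
    """Seeded k-means: assign each word to its most similar centroid, then re-average."""
    centroids = [list(vectors[s]) for s in seeds]
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {
            w: max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for w, v in vectors.items()
        }
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [vectors[w] for w, a in assignment.items() if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assignment

# Hypothetical toy data: "screen"/"display" and "battery"/"power" share contexts.
reviews = [
    ["screen", "bright", "clear"],
    ["display", "bright", "clear"],
    ["battery", "lasts", "long"],
    ["power", "lasts", "long"],
]
targets = ["screen", "display", "battery", "power"]
clusters = kmeans(context_vectors(reviews, targets), seeds=["screen", "battery"])
```

Here `clusters` maps each target word to a cluster index. Words that never appear together can still land in the same cluster because they share context words, which is the distributional-similarity intuition the abstract relies on.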



Author information

Authors and Affiliations

School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, 310018, China
Yu Zhang, Miao Liu & Hai-Xia Xia


Corresponding author

Correspondence to Yu Zhang.

Additional information

This work was supported by the Commonweal Technical Project of Zhejiang Province of China under Grant No. 2013C33063, the National Natural Science Foundation of China under Grant Nos. 61100183, 61402417, the Natural Science Foundation of Zhejiang Province of China under Grant No. LQ13F020014, and the 521 Talents Project of Zhejiang Sci-Tech University.


About this article

Cite this article

Zhang, Y., Liu, M. & Xia, HX. Clustering Context-Dependent Opinion Target Words in Chinese Product Reviews. J. Comput. Sci. Technol. 30, 1109–1119 (2015). https://doi.org/10.1007/s11390-015-1586-2


Received: 15 November 2014
Revised: 08 June 2015
Published: 14 September 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s11390-015-1586-2
