Clustering Chinese Product Features with Multilevel Similarity

He, Yu; Song, Jiaying; Nan, Yuzhuang; Fu, Guohong

doi:10.1007/978-3-319-25816-4_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9427)

Abstract

This paper presents an unsupervised hierarchical clustering approach for grouping co-referred features in Chinese product reviews. To handle different levels of connections between co-referred product features, we consider three similarity measures: literal similarity, word-embedding-based semantic similarity, and explanatory-evaluation-based contextual similarity. We apply our approach to two corpora of product reviews in the car and mobile phone domains. We demonstrate that combining multilevel similarity is of great value to feature normalization.
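
As a rough illustration of the idea in the abstract, the sketch below combines three pairwise similarities and feeds them to a simple agglomerative clustering loop. The abstract names only the three similarity types and the use of hierarchical clustering, so everything else here is an assumption made for illustration: the character-level Dice coefficient used for literal similarity, cosine similarity over toy vectors for the semantic and contextual levels, the equal weights, the average-linkage rule, the 0.6 stopping threshold, and all function and variable names. It is not the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def literal_sim(a, b):
    """Dice coefficient over character sets (assumed stand-in for literal similarity)."""
    sa, sb = set(a), set(b)
    return 2 * len(sa & sb) / (len(sa) + len(sb)) if sa and sb else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combined_sim(a, b, emb, ctx, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three similarity levels (equal weights are an assumption)."""
    w_lit, w_sem, w_ctx = weights
    return (w_lit * literal_sim(a, b)
            + w_sem * cosine(emb[a], emb[b])
            + w_ctx * cosine(ctx[a], ctx[b]))


def cluster(features, emb, ctx, threshold=0.6):
    """Average-linkage agglomerative clustering; stops when no pair reaches `threshold`."""
    clusters = [[f] for f in features]

    def linkage(c1, c2):
        total = sum(combined_sim(a, b, emb, ctx) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters (i < j).
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data, made up for this sketch: two screen terms and two battery terms.
features = ["屏幕", "显示屏", "电池", "电量"]
emb = {"屏幕": [1.0, 0.0], "显示屏": [0.9, 0.1], "电池": [0.0, 1.0], "电量": [0.1, 0.9]}
ctx = {"屏幕": [1.0, 0.0], "显示屏": [0.8, 0.2], "电池": [0.0, 1.0], "电量": [0.2, 0.8]}

groups = cluster(features, emb, ctx)
print(groups)
```

On this toy data the two screen terms end up in one group and the two battery terms in another. In practice the `emb` vectors would come from a trained embedding model and the `ctx` vectors from whatever contextual representation is chosen; both are placeholders here.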

Notes

1. http://code.google.com/p/word2vec/

Acknowledgments

This study was supported by the National Natural Science Foundation of China under Grant No. 61170148 and by the Returned Scholar Foundation of Heilongjiang Province.

Author information

Authors and Affiliations

School of Computer Science and Technology, Heilongjiang University, Harbin, 150080, China
Yu He, Jiaying Song, Yuzhuang Nan & Guohong Fu

Corresponding author

Correspondence to Guohong Fu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Maosong Sun
Tsinghua University, Beijing, China
Zhiyuan Liu
Soochow University, Suzhou, Jiangsu, China
Min Zhang
Tsinghua University, Beijing, China
Yang Liu

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

He, Y., Song, J., Nan, Y., Fu, G. (2015). Clustering Chinese Product Features with Multilevel Similarity. In: Sun, M., Liu, Z., Zhang, M., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL NLP-NABD 2015. Lecture Notes in Computer Science (LNAI), vol 9427. Springer, Cham. https://doi.org/10.1007/978-3-319-25816-4_28

DOI: https://doi.org/10.1007/978-3-319-25816-4_28
Published: 08 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25815-7
Online ISBN: 978-3-319-25816-4
eBook Packages: Computer Science, Computer Science (R0)