Abstract
In opinion mining, extracting fine-grained product features is a challenging problem. Nouns are the most important words for representing product features. Generative models such as latent Dirichlet allocation (LDA) have been used to detect keyword clusters in document corpora. Because adjectives often dominate review corpora, they are often excluded from the vocabulary of such generative models for opinion sentiment analysis. On the other hand, adjectives provide useful context for noun features, as they are often semantically related to the nouns. To take advantage of these semantic relations, we construct dependency trees to extract noun-adjective pairs that stand in a semantic dependency relation. We propose a semantic dependent word pairs generative model for the noun-adjective pairs in each sentence. Product features and their corresponding adjectives are simultaneously clustered into distinct groups, which improves the accuracy of the product features and also yields clustered adjectives. Experimental results demonstrate the advantage of our models, which achieve lower perplexity and lower average cluster entropy than baseline models based on LDA. Highly semantically cohesive, descriptive, and discriminative fine-grained product features are obtained automatically.
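As a rough illustration of the pair-extraction step described in the abstract, the sketch below collects noun-adjective pairs from a dependency-parsed sentence. It is not the authors' implementation: the `Token` type, the `noun_adj_pairs` function, the POS tags (`NOUN`, `ADJ`), and the relation labels (`amod`, `nsubj`) are all assumptions made for this example, and the parse is written by hand rather than produced by a parser.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """One token of a dependency-parsed sentence (hypothetical schema)."""
    text: str
    pos: str   # coarse POS tag, e.g. "NOUN" or "ADJ" (assumed tag set)
    head: int  # index of the head token in the sentence, -1 for the root
    rel: str   # dependency label linking this token to its head (assumed labels)


def noun_adj_pairs(sentence: List[Token]) -> List[Tuple[str, str]]:
    """Return (noun, adjective) pairs linked by a direct dependency arc."""
    pairs = []
    for tok in sentence:
        if tok.head < 0:
            continue  # the root has no head to pair with
        head = sentence[tok.head]
        # Attributive use: the adjective modifies the noun ("friendly staff").
        if tok.rel == "amod" and tok.pos == "ADJ" and head.pos == "NOUN":
            pairs.append((head.text.lower(), tok.text.lower()))
        # Predicative use: the noun is the subject of the adjective
        # ("the staff was helpful").
        elif tok.rel == "nsubj" and tok.pos == "NOUN" and head.pos == "ADJ":
            pairs.append((tok.text.lower(), head.text.lower()))
    return pairs


# Hand-written parse of "The friendly staff was helpful".
sentence = [
    Token("The", "DET", 2, "det"),
    Token("friendly", "ADJ", 2, "amod"),
    Token("staff", "NOUN", 4, "nsubj"),
    Token("was", "AUX", 4, "cop"),
    Token("helpful", "ADJ", -1, "root"),
]

print(noun_adj_pairs(sentence))
```

In a setting like the one the abstract describes, pairs of this kind, collected sentence by sentence, would form the input to the word-pair generative model in place of the single-word vocabulary used by plain LDA.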
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhan, TJ., Li, CH. (2011). Semantic Dependent Word Pairs Generative Model for Fine-Grained Product Feature Mining. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_38
DOI: https://doi.org/10.1007/978-3-642-20841-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20840-9
Online ISBN: 978-3-642-20841-6
eBook Packages: Computer Science (R0)