Abstract
The explosive growth of the Chinese electronic market has made it possible for companies to understand consumers’ opinions of their products in a timely fashion through online reviews. This study proposes a framework for extracting knowledge from online reviews through text mining and econometric analysis. Specifically, we extract product features, detect topics, and identify determinants of customer satisfaction. An experiment on online reviews from a leading Chinese B2C (Business-to-Customer) website demonstrates the feasibility of the proposed method. We also present some findings about the characteristics of Chinese reviewers.
Acknowledgment
This research was supported by the Beijing Forestry University Young Scientist Fund (Nos. BLX201127 and YSE2011-8), the National Natural Science Foundation of China under Grant Nos. 90924020, 71101153, and 70971005, the PhD Program Foundation of Education Ministry of China under Contract No. 200800060005, and Alibaba Young Researcher Funding under Contract No. Ali-2010-B-6.
Additional information
Responsible Editor: Xin Luo
About this article
Cite this article
You, W., Xia, M., Liu, L. et al. Customer knowledge discovery from online reviews. Electron Markets 22, 131–142 (2012). https://doi.org/10.1007/s12525-012-0098-y
DOI: https://doi.org/10.1007/s12525-012-0098-y