Abstract
Online customer reviews offer valuable information to merchants and potential shoppers in e-commerce and e-business. However, even a single product often attracts hundreds or thousands of reviews, so summarizing multiple customer reviews helps extract the issues that merchants and customers care about. Existing multi-document summarization methods first divide the documents into non-overlapping clusters and then summarize each cluster individually, on the assumption that each cluster discusses a single topic. When these methods are applied to customer reviews, however, it is difficult to determine the number of clusters a priori without domain knowledge, and the topics in a collection of reviews often overlap. This paper proposes a summarization approach based on the topical structure of multiple customer reviews. Instead of clustering followed by summarization, our approach extracts topics from a collection of reviews and ranks them by frequency. The summary is then generated according to the ranked topics. The evaluation results showed that our approach outperformed the baseline summarization systems, namely the Copernic summarizer and clustering-summarization, in terms of users’ responsiveness.
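The extract, rank, and generate pipeline described above can be illustrated with a minimal Python sketch. The abstract does not say how topics are extracted, so this sketch makes its own assumptions: a "topic" is approximated by a word n-gram that appears in at least `min_support` reviews, n-grams subsumed by a longer frequent n-gram are dropped, and each ranked topic contributes the first sentence that mentions it. The stopword list, the function names, and the parameters `min_support`, `max_len`, and `top_k` are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "but", "is", "it", "the", "this", "was"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def ngrams(tokens, max_len):
    """All contiguous n-grams (as tuples) of length 1..max_len."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }


def contains(tokens, topic):
    """True if `topic` occurs as a contiguous run inside `tokens`."""
    n = len(topic)
    return any(tuple(tokens[i:i + n]) == topic for i in range(len(tokens) - n + 1))


def rank_topics(reviews, min_support=2, max_len=2):
    """Return (topic, review_count) pairs, most frequent first."""
    support = Counter()
    for review in reviews:
        seen = set()  # count each topic at most once per review
        for sentence in split_sentences(review):
            seen |= ngrams(tokenize(sentence), max_len)
        support.update(seen)
    frequent = {t: c for t, c in support.items() if c >= min_support}
    # Keep only topics not contained in a longer frequent topic.
    maximal = [
        (t, c) for t, c in frequent.items()
        if not any(len(u) > len(t) and contains(u, t) for u in frequent)
    ]
    return sorted(maximal, key=lambda tc: (-tc[1], tc[0]))


def summarize(reviews, top_k=3, min_support=2, max_len=2):
    """Pick one sentence per ranked topic, skipping topics already covered."""
    sentences = [s for r in reviews for s in split_sentences(r)]
    summary = []
    for topic, _ in rank_topics(reviews, min_support, max_len)[:top_k]:
        if any(contains(tokenize(s), topic) for s in summary):
            continue
        match = next((s for s in sentences if contains(tokenize(s), topic)), None)
        if match is not None:
            summary.append(match)
    return summary
```

Because topics are counted across the whole collection rather than within pre-computed clusters, the sketch needs no cluster count up front, and a sentence that touches several topics can cover more than one of them. A real system would replace the naive sentence splitter and n-gram heuristic with proper preprocessing and topic extraction.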
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhan, J., Loh, H.T., Liu, Y. (2008). Summarizing Online Customer Reviews Automatically Based on Topical Structure. In: Filipe, J., Cordeiro, J. (eds) Web Information Systems and Technologies. WEBIST 2007. Lecture Notes in Business Information Processing, vol 8. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68262-2_18
DOI: https://doi.org/10.1007/978-3-540-68262-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68257-8
Online ISBN: 978-3-540-68262-2
eBook Packages: Computer Science (R0)