DOI: 10.1145/2020408.2020440

Selecting a comprehensive set of reviews

Published: 21 August 2011

ABSTRACT

Online user reviews play a central role in users' decision making for a variety of tasks, ranging from entertainment and shopping to medical services. As user-generated reviews proliferate, it becomes critical to have a mechanism that helps users (information consumers) deal with the information overload by presenting them with a small, comprehensive set of reviews that satisfies their information need. This is particularly important for mobile phone users, who need to make decisions quickly and whose devices have limited screen real estate for displaying reviews. Previous approaches have addressed the problem by ranking reviews according to their (estimated) helpfulness. However, such approaches do not account for the fact that the top few high-quality reviews may be highly redundant, repeating the same information or presenting the same positive (or negative) perspective. In this work, we focus on the problem of selecting a small, comprehensive set of high-quality reviews that covers many different aspects of the reviewed item. We formulate the problem as a maximum coverage problem, and we present a generic formalism that can model the different variants of review-set selection. We describe algorithms for the variants we consider and, whenever possible, provide approximation guarantees with respect to the optimal solution. We also perform an experimental evaluation on real data to understand the value of coverage for users.
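To make the maximum coverage formulation concrete, here is a minimal sketch of the standard greedy heuristic for maximum coverage, which is known to achieve a (1 - 1/e) approximation for the basic unweighted problem. The abstract does not say which algorithms the paper uses for each variant, so this is an illustration under assumptions rather than the authors' method: each review is modeled as a set of aspect labels, and the review ids, aspect names, and the function name `greedy_max_coverage` are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_max_coverage(reviews: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick at most k reviews, each time taking the one that covers
    the most aspects not yet covered by earlier picks.

    `reviews` maps a (hypothetical) review id to the set of aspects
    that review mentions.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    for _ in range(min(k, len(reviews))):
        best_id, best_gain = None, 0
        # Iterate in sorted order so ties are broken deterministically.
        for rid in sorted(reviews):
            if rid in chosen:
                continue
            gain = len(reviews[rid] - covered)  # newly covered aspects
            if gain > best_gain:
                best_id, best_gain = rid, gain
        if best_id is None:
            break  # no remaining review adds a new aspect
        chosen.append(best_id)
        covered |= reviews[best_id]
    return chosen


# Hypothetical toy data: r1 is redundant once r2 has been selected.
reviews = {
    "r1": {"battery", "screen"},
    "r2": {"battery", "screen", "price"},
    "r3": {"camera"},
    "r4": {"price"},
}
print(greedy_max_coverage(reviews, 2))  # ['r2', 'r3']
```

In this toy input, a ranking that picks the two most detailed reviews would return r2 and r1 and never mention the camera, while the coverage-driven selection replaces the redundant r1 with r3.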

Published in

KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2011
1446 pages
ISBN: 9781450308137
DOI: 10.1145/2020408

Copyright © 2011 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,133 of 8,635 submissions (13%)
