DOI: 10.1145/1963405.1963417
research-article

Learning to model relatedness for news recommendation

Published: 28 March 2011

ABSTRACT

With the explosive growth of online news readership, recommending interesting news articles to users has become extremely important. While existing Web services such as Yahoo! and Digg attract users' initial clicks by leveraging various kinds of signals, how to engage such users algorithmically after their initial visit is largely under-explored. In this paper, we study the problem of post-click news recommendation. Given that a user has perused a current news article, our idea is to automatically identify "related" news articles which the user would like to read afterwards. Specifically, we propose to characterize relatedness between news articles across four aspects: relevance, novelty, connection clarity, and transition smoothness. Motivated by this understanding, we define a set of features to capture each of these aspects and put forward a learning approach to model relatedness. In order to quantitatively evaluate our proposed measures and learn a unified relatedness function, we construct a large test collection based on a four-month commercial news corpus with editorial judgments. The experimental results show that the proposed heuristics can indeed capture relatedness, and that the learned unified relatedness function works quite effectively.
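To make the relevance/novelty tension in the abstract concrete, here is a minimal sketch, not the paper's method. It assumes two toy features (cosine similarity of term counts for relevance, and the share of unseen terms for novelty) and a hand-written product in place of the learned relatedness function; the abstract does not specify the actual features or the learner. Connection clarity and transition smoothness are left out because the abstract does not define them. All function names are hypothetical.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercased whitespace tokens with counts (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def relevance(seed, candidate):
    """Toy relevance: cosine similarity between term-count vectors."""
    a, b = term_counts(seed), term_counts(candidate)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def novelty(seed, candidate):
    """Toy novelty: fraction of the candidate's distinct terms absent from the seed."""
    a, b = term_counts(seed), term_counts(candidate)
    if not b:
        return 0.0
    return len(b.keys() - a.keys()) / len(b)


def relatedness(seed, candidate):
    """Assumed stand-in for the learned function: product of the two toy features.

    A product scores zero for exact duplicates (no novelty) and for
    off-topic articles (no relevance).
    """
    return relevance(seed, candidate) * novelty(seed, candidate)


def rank(seed, candidates):
    """Order candidate articles by descending toy relatedness to the seed."""
    return sorted(candidates, key=lambda c: relatedness(seed, c), reverse=True)


if __name__ == "__main__":
    seed = "oil prices rise"
    candidates = [
        "oil prices rise",
        "football match tonight",
        "oil prices fall sharply",
    ]
    for article in rank(seed, candidates):
        print(round(relatedness(seed, article), 3), article)
```

In this toy setup, an article that overlaps with the seed but also adds new terms outranks both a verbatim duplicate and an unrelated story, which is the kind of behavior the four-aspect framing is meant to capture.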

  • Published in

    WWW '11: Proceedings of the 20th international conference on World wide web
    March 2011
    840 pages
    ISBN: 9781450306324
    DOI: 10.1145/1963405

    Copyright © 2011 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
