Research article, DOI: 10.1145/1390334.1390382

Learning to rank with ties

Published: 20 July 2008

ABSTRACT

Designing effective ranking functions is a core problem for information retrieval and Web search, since ranking functions directly affect the relevance of search results. The problem has been the focus of much research at the intersection of Web search and machine learning, and learning ranking functions from preference data in particular has recently attracted much interest. The objective of this paper is to empirically examine several objective functions that can be used for learning ranking functions from preference data. Specifically, we investigate the role of ties in the learning process. By ties, we mean preference judgments that two documents have an equal degree of relevance with respect to a query. This type of data has largely been ignored or not properly modeled in the past. In this paper, we analyze the properties of ties and develop novel learning frameworks that combine ties and preference data using statistical paired comparison models to improve the performance of learned ranking functions. The resulting optimization problems, which explicitly incorporate ties and preference data, are solved using gradient boosting methods. Experimental studies conducted on three publicly available data sets demonstrate the effectiveness of the proposed methods.
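
To make the idea of a paired comparison model with ties concrete, the sketch below uses the Rao-Kupper generalization of the Bradley-Terry model, a classical way to assign a probability to a tie between two items. The abstract does not state which models or parameterization the paper uses, so the function names, the exponential link from scores to strengths, and the default `theta` are all assumptions made for illustration.

```python
import math


def rao_kupper_probs(score_i, score_j, theta=1.5):
    """Return (P(i preferred), P(j preferred), P(tie)) for two ranking scores.

    Assumption: item strengths are exp(score), and theta >= 1 is a tie
    parameter; theta == 1 gives the plain Bradley-Terry model (no ties).
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    pi_i, pi_j = math.exp(score_i), math.exp(score_j)
    denom_i = pi_i + theta * pi_j
    denom_j = theta * pi_i + pi_j
    p_i = pi_i / denom_i
    p_j = pi_j / denom_j
    p_tie = (theta ** 2 - 1.0) * pi_i * pi_j / (denom_i * denom_j)
    return p_i, p_j, p_tie


def pair_loss(score_i, score_j, label, theta=1.5):
    """Negative log-likelihood of one judgment; label is "i", "j", or "tie".

    A "tie" label needs theta > 1, otherwise its probability is zero.
    """
    p_i, p_j, p_tie = rao_kupper_probs(score_i, score_j, theta)
    return -math.log({"i": p_i, "j": p_j, "tie": p_tie}[label])
```

Under this sketch, a tie judgment is cheapest when the two scores are equal and becomes more expensive as they move apart, while a preference judgment rewards a large score gap in the right direction. A boosting procedure could in principle minimize the sum of such losses over all judged pairs, though how the paper sets up that optimization is not described here.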


Published in

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008
934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334

Copyright © 2008 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions (20%)
