Abstract
We study the subset ranking problem, motivated by its important application in web search. In this context, we consider the standard DCG criterion (discounted cumulated gain), which measures the quality of items near the top of the rank-list. As with error minimization for binary classification, the DCG criterion leads to a non-convex optimization problem that can be NP-hard. A computationally more tractable approach is therefore needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that they focus on estimation quality in the top portion of the rank-list. We further investigate the generalization ability of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
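For concreteness, the sketch below (ours, not taken from the paper) computes the DCG of a rank-list and illustrates the regression-based scheme the bounds justify: learn a real-valued scoring function by regression, sort the items of a subset by predicted score, and evaluate the induced ordering with DCG. The logarithmic discount 1/log2(i+2) and the toy data are illustrative assumptions; the paper's analysis allows a general decreasing discount sequence and emphasizes accuracy near the top of the list (hence the optional top-k truncation).

```python
import math


def dcg(grades, k=None):
    """Discounted cumulated gain of a rank-list.

    grades[i] is the relevance grade of the item placed at position i
    (0-indexed). The common logarithmic discount 1 / log2(i + 2) is
    assumed here; the paper allows any decreasing discount sequence.
    """
    if k is not None:
        grades = grades[:k]  # truncated (top-k) variant
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades))


def rank_by_regression(items, score, grade):
    """Order `items` by a learned regression score (highest first) and
    return the DCG of the induced rank-list against the true grades."""
    order = sorted(items, key=score, reverse=True)
    return dcg([grade[x] for x in order])


# Toy usage (hypothetical data): three items graded 2, 0, 1, scored by
# an imperfect regression estimate of the expected grade.
grades = {"a": 2, "b": 0, "c": 1}
scores = {"a": 1.7, "b": 0.2, "c": 0.9}
print(rank_by_regression(["a", "b", "c"], scores.get, grades))
# order a, c, b:  2 / log2(2) + 1 / log2(3) + 0 / log2(4)  ≈  2.63
```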
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Cossock, D., Zhang, T. (2006). Subset Ranking Using Regression. In: Lugosi, G., Simon, H.U. (eds.) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol. 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_44
DOI: https://doi.org/10.1007/11776420_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9