Abstract
We consider the extension of standard decision tree methods to the bipartite ranking problem. In ranking, the goal is global: to define an order on the whole input space so that positive instances appear at the top with maximum probability. The most natural way of ordering all instances is to project the input data x onto the real line using a real-valued scoring function s; the accuracy of the ordering induced by a candidate s is classically measured by the area under the ROC curve (AUC). In this paper, we discuss the design of tree-structured scoring functions obtained by maximizing the AUC criterion. In particular, we discuss the connection with recursive piecewise linear approximation of the optimal ROC curve, both in the L1-sense and in the L∞-sense.
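As a rough illustration of how a scoring function s induces an ordering and how that ordering is scored by the AUC, the sketch below computes the empirical AUC as the fraction of (positive, negative) pairs that s ranks correctly, counting ties as one half. This is not the algorithm of the paper: the toy data, the function names `empirical_auc` and `stump_score`, and the threshold 0.5 are assumptions made for illustration only. The two-cell `stump_score` stands in for the simplest piecewise-constant (tree-structured) scoring function.

```python
from typing import Callable, List, Sequence


def empirical_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    total = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                total += 1.0
            elif sp == sn:
                total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


def stump_score(x: float, threshold: float = 0.5) -> float:
    """Hypothetical two-cell piecewise-constant scoring function (a depth-1 tree)."""
    return 1.0 if x > threshold else 0.0


def auc_of(s: Callable[[float], float], xs: List[float], ys: List[int]) -> float:
    """Empirical AUC of the ordering induced by the scoring function s."""
    pos = [s(x) for x, y in zip(xs, ys) if y == 1]
    neg = [s(x) for x, y in zip(xs, ys) if y == 0]
    return empirical_auc(pos, neg)


# Toy one-dimensional sample (assumed data): label 1 is a positive instance.
xs = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
ys = [0, 0, 1, 1, 0, 1]

auc_stump = auc_of(stump_score, xs, ys)        # coarse two-level ordering
auc_identity = auc_of(lambda x: x, xs, ys)     # finer ordering, s(x) = x
print(auc_stump, auc_identity)
```

On this toy sample the two-cell scoring function ties many pairs and reaches an AUC of 6/9, while the finer ordering s(x) = x reaches 7/9. This mirrors the intuition that refining the partition can only help the AUC when the new cells are ordered well.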
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Clémençon, S., Vayatis, N. (2008). Approximation of the Optimal ROC Curve and a Tree-Based Ranking Algorithm. In: Freund, Y., Györfi, L., Turán, G., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2008. Lecture Notes in Computer Science, vol 5254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87987-9_7
DOI: https://doi.org/10.1007/978-3-540-87987-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87986-2
Online ISBN: 978-3-540-87987-9
eBook Packages: Computer Science, Computer Science (R0)