Abstract
The TreeRank algorithm was recently proposed in [1] and [2] as a scoring method based on recursive partitioning of the input space. This tree-induction algorithm builds orderings by recursively optimizing the Receiver Operating Characteristic (ROC) curve through a one-step optimization procedure called LeafRank. One aim of this paper is an in-depth analysis of the empirical performance of variants of the TreeRank/LeafRank method. Numerical experiments on both artificial and real data sets are provided. Further experiments using resampling and randomization, in the spirit of bagging and random forests [3, 4], are developed, and we show how they increase both stability and accuracy in bipartite ranking. Moreover, an empirical comparison with other efficient scoring algorithms, such as RankBoost and RankSVM, is presented on UCI benchmark data sets.
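To illustrate the two ideas the abstract combines, the sketch below pairs a one-step empirical AUC optimization over axis-aligned splits (a crude stand-in for the LeafRank step, not the authors' actual procedure) with bootstrap aggregation of the resulting scores, in the spirit of bagging. All function names (`auc`, `best_stump`, `bagged_scores`) are illustrative assumptions, not part of the published method.

```python
import random

def auc(scores, labels):
    # Empirical AUC: fraction of positive/negative pairs ranked
    # correctly, with ties counted as 0.5.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.5
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_stump(X, y):
    # One-step AUC maximization over axis-aligned splits: a toy
    # stand-in for the LeafRank subroutine. Returns the best AUC,
    # the split feature/threshold, and whether the split is flipped
    # (i.e., the low side scores higher).
    best_a, best_j, best_t, best_flip = 0.0, 0, float("-inf"), False
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            scores = [1.0 if x[j] > t else 0.0 for x in X]
            a = auc(scores, y)
            flip = a < 0.5
            a = max(a, 1.0 - a)  # orient the split toward positives
            if a > best_a:
                best_a, best_j, best_t, best_flip = a, j, t, flip
    return best_a, best_j, best_t, best_flip

def bagged_scores(X, y, X_test, n_rounds=25, seed=0):
    # Average stump scores over bootstrap resamples (bagging-style
    # aggregation of scoring rules).
    rng = random.Random(seed)
    agg = [0.0] * len(X_test)
    for _ in range(n_rounds):
        # Resample until both classes are present in the bootstrap.
        while True:
            idx = [rng.randrange(len(X)) for _ in range(len(X))]
            yb = [y[i] for i in idx]
            if 0 < sum(yb) < len(yb):
                break
        Xb = [X[i] for i in idx]
        _, j, t, flip = best_stump(Xb, yb)
        for k, x in enumerate(X_test):
            ind = 1.0 if x[j] > t else 0.0
            agg[k] += 1.0 - ind if flip else ind
    return [s / n_rounds for s in agg]
```

On a separable toy set, each bootstrapped stump differs slightly in its threshold, and averaging them yields a smoother, more stable piecewise-constant score, which is the intuition behind the resampling experiments described above.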
References
Clémençon S, Vayatis N (2009) Tree-based ranking methods. IEEE Trans Inf Theory 55(9):4316–4336
Clémençon S, Depecker M, Vayatis N (2011) Adaptive partitioning schemes for bipartite ranking. Mach Learn 83(1):31–69
Clémençon S, Depecker M, Vayatis N (2009) Bagging ranking trees. In: Proceedings of ICMLA, international conference on machine learning and applications
Clémençon S, Vayatis N (2010) Ranking forests (to be published)
Freund Y, Iyer R, Schapire RE, Singer Y (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4:933–969
Hastie T, Tibshirani R (1990) Generalized additive models. Chapman & Hall, Boca Raton
Zhu J, Hastie T (2005) Kernel logistic regression and the import vector machine. J Comput Graph Stat 14(1):185–205
Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 28(2):337–407
Joachims T (2002) Optimizing search engines using clickthrough data. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, pp 133–142
Pahikkala T, Tsivtsivadze E, Airola A, Boberg J, Salakoski T (2007) Learning to rank with pairwise regularized least-squares. In: Proceedings of SIGIR 2007 workshop on learning to rank for information retrieval, pp 27–33
Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G (2005) Learning to rank using gradient descent. In: Proceedings of ICML, 22nd international conference on machine learning, pp 89–96
Dodd L, Pepe M (2003) Partial AUC estimation and regression. Biometrics 59(3):614–623
Clémençon S, Vayatis N (2007) Ranking the Best Instances. J Mach Learn Res 8:2671–2699
Clémençon S, Vayatis N (2008) Empirical performance maximization for linear rank statistics. In: Proceedings of NIPS’08, conference on neural information processing systems, pp 305–312
Rudin C (2009) The P-norm push: a simple convex ranking algorithm that concentrates at the top of the list. J Mach Learn Res 10:2233–2271
Robertson S, Zaragoza H (2007) On rank-based effectiveness measures and optimization. Inf Retr 10(3):321–339
Bartlett P, Jordan M, McAuliffe J (2006) Convexity, classification, and risk bounds. J Am Stat Assoc 101(473):138–156
Bartlett P, Tewari A (2007) Sparseness vs estimating conditional probabilities: some asymptotic results. J Mach Learn Res 8:775–790
Mease D, Wyner A (2008) Evidence contrary to the statistical view of boosting. J Mach Learn Res 9:131–156
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, Berlin
Clémençon S, Vayatis N (2010) Overlaying classifiers: a practical approach for optimal scoring. Constr Approx 32(3):619–648
Boucheron S, Bousquet O, Lugosi G (2005) Theory of classification: a survey of recent advances. ESAIM Probab Stat 9:323–375
Hanley JA, McNeil BJ (1982) The meaning and use of the area under a ROC curve. Radiology 143:29–36
Clémençon S, Lugosi G, Vayatis N (2008) Ranking and empirical risk minimization of U-statistics. Ann Stat 36:844–874
Ailon N, Mohri M (2010) Preference-based learning to rank. Mach Learn 80(2):189–211
Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth and Brooks, Monterey
Bach FR, Heckerman D, Eric H (2006) Considering cost asymmetry in learning classifiers. J Mach Learn Res 7:1713–1741
Acknowledgments
We warmly thank Cynthia Rudin who kindly provided the code for the P-norm Push algorithm.
Cite this article
Clémençon, S., Depecker, M. & Vayatis, N. An empirical comparison of learning algorithms for nonparametric scoring: the TreeRank algorithm and other methods. Pattern Anal Applic 16, 475–496 (2013). https://doi.org/10.1007/s10044-012-0299-1