An empirical comparison of learning algorithms for nonparametric scoring: the TreeRank algorithm and other methods

  • Survey
Pattern Analysis and Applications

Abstract

The TreeRank algorithm was recently proposed in [1] and [2] as a scoring method based on recursive partitioning of the input space. This tree induction algorithm builds orderings by recursively optimizing the Receiver Operating Characteristic (ROC) curve through a one-step optimization procedure called LeafRank. One aim of this paper is an in-depth analysis of the empirical performance of the variants of the TreeRank/LeafRank method. Numerical experiments on both artificial and real data sets are provided. Further experiments using resampling and randomization, in the spirit of bagging and random forests, are developed [3, 4], and we show how they increase both stability and accuracy in bipartite ranking. Moreover, an empirical comparison with other efficient scoring algorithms such as RankBoost and RankSVM is presented on UCI benchmark data sets.
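
The recursive scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the splitting step `best_stump` is a hypothetical stand-in for LeafRank that only tries axis-aligned thresholds and keeps the one maximizing the difference between the true positive rate and the false positive rate inside the current cell. The function names, the dyadic score encoding in `score`, and the toy data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def empirical_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mann-Whitney estimate of the AUC; tied pairs count for 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_stump(X, y, idx) -> Optional[Tuple[float, int, float, int]]:
    """Hypothetical stand-in for LeafRank (assumption, not the paper's rule).

    Returns (gain, feature, threshold, sign) maximizing TPR - FPR of the
    'left' (higher-ranked) side within the cell, or None if no split exists.
    """
    n_pos = sum(y[i] for i in idx)
    n_neg = len(idx) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None  # pure cell: nothing to rank
    best = None
    for j in range(len(X[0])):
        values = sorted({X[i][j] for i in idx})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for sign in (1, -1):  # which side of t is ranked higher
                left = [i for i in idx if sign * (X[i][j] - t) > 0]
                tpr = sum(y[i] for i in left) / n_pos
                fpr = sum(1 - y[i] for i in left) / n_neg
                if best is None or tpr - fpr > best[0]:
                    best = (tpr - fpr, j, t, sign)
    return best


def tree_rank(X, y, idx: List[int], depth: int):
    """Grow a ranking tree; None is a leaf, else (j, t, sign, left, right)."""
    split = best_stump(X, y, idx) if depth > 0 else None
    if split is None or split[0] <= 0:
        return None
    _, j, t, sign = split
    left = [i for i in idx if sign * (X[i][j] - t) > 0]
    right = [i for i in idx if sign * (X[i][j] - t) <= 0]
    return (j, t, sign,
            tree_rank(X, y, left, depth - 1),
            tree_rank(X, y, right, depth - 1))


def score(tree, x, lo: float = 0.0, hi: float = 1.0) -> float:
    """Score x so that leaves further to the left receive higher values."""
    if tree is None:
        return lo
    j, t, sign, left, right = tree
    mid = (lo + hi) / 2
    if sign * (x[j] - t) > 0:
        return score(left, x, mid, hi)   # left subtree: upper half
    return score(right, x, lo, mid)      # right subtree: lower half


# Toy one-dimensional sample (illustrative data, not from the paper).
X = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
tree = tree_rank(X, y, list(range(len(X))), depth=3)
print(empirical_auc([score(tree, x) for x in X], y))
```

The score halves the interval at each node so that every leaf in a left subtree is ranked above every leaf in the corresponding right subtree, which gives the left-to-right ordering of leaves without storing leaf indices. The printed value is a training AUC on the toy sample and says nothing about generalization.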

References

  1. Clémençon S, Vayatis N (2009) Tree-based ranking methods. IEEE Trans Inf Theory 55(9):4316–4336

  2. Clémençon S, Depecker M, Vayatis N (2011) Adaptive partitioning schemes for bipartite ranking. J Mach Learn 83(1):31–69

  3. Clémençon S, Depecker M, Vayatis N (2009) Bagging ranking trees. In: Proceedings of ICMLA, international conference on machine learning and applications

  4. Clémençon S, Vayatis N (2010) Ranking forests (to be published)

  5. Freund Y, Iyer R, Schapire RE, Singer Y (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4:933–969

  6. Hastie T, Tibshirani R (1990) Generalized additive models. Chapman & Hall, Boca Raton

  7. Zhu J, Hastie T (2005) Kernel logistic regression and the import vector machine. J Comput Graph Stat 14(1):185–205

  8. Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 28(2):337–407

  9. Joachims T (2002) Optimizing search engines using clickthrough data. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining, pp 133–142

  10. Pahikkala T, Tsivtsivadze E, Airola A, Boberg J, Salakoski T (2007) Learning to rank with pairwise regularized least-squares. In: Proceedings of SIGIR 2007 workshop on learning to rank for information retrieval, pp 27–33

  11. Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G (2005) Learning to rank using gradient descent. In: Proceedings of ICML, 22nd international conference on machine learning, pp 89–96

  12. Dodd L, Pepe M (2003) Partial AUC estimation and regression. Biometrics 59(3):614–623

  13. Clémençon S, Vayatis N (2007) Ranking the Best Instances. J Mach Learn Res 8:2671–2699

  14. Clémençon S, Vayatis N (2008) Empirical performance maximization for linear rank statistics. In: Proceedings of NIPS’08, conference on neural information processing systems, pp 305–312

  15. Rudin C (2009) The P-norm push: a simple convex ranking algorithm that concentrates at the top of the list. J Mach Learn Res 10:2233–2271

  16. Robertson S, Zaragoza H (2007) On rank-based effectiveness measures and optimization. Inf Retr 10(3):321–339

  17. Bartlett P, Jordan M, McAuliffe J (2006) Convexity, classification, and risk bounds. J Am Stat Assoc 101(473):138–156

  18. Bartlett P, Tewari A (2007) Sparseness vs estimating conditional probabilities: some asymptotic results. J Mach Learn Res 8:775–790

  19. Mease D, Wyner A (2008) Evidence contrary to the statistical view of boosting. J Mach Learn Res 9:131–156

  20. Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, Berlin

  21. Clémençon S, Vayatis N (2010) Overlaying classifiers: a practical approach for optimal scoring. Constr Approx 32(3):619–648

  22. Boucheron S, Bousquet O, Lugosi G (2005) Theory of classification: a survey of recent advances. ESAIM Probab Stat 9:323–375

  23. Hanley J, McNeil B (1982) The meaning and use of the area under a ROC curve. Radiology 143:29–36

  24. Clémençon S, Lugosi G, Vayatis N (2008) Ranking and empirical risk minimization of U-statistics. Ann Stat 36:844–874

  25. Ailon N, Mohri M (2010) Preference-based learning to rank. Mach Learn J 80(2):189–211

  26. Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth and Brooks, Monterey

  27. Bach FR, Heckerman D, Horvitz E (2006) Considering cost asymmetry in learning classifiers. J Mach Learn Res 7:1713–1741

Acknowledgments

We warmly thank Cynthia Rudin, who kindly provided the code for the P-norm Push algorithm.

Author information

Corresponding author

Correspondence to Marine Depecker.

About this article

Cite this article

Clémençon, S., Depecker, M. & Vayatis, N. An empirical comparison of learning algorithms for nonparametric scoring: the TreeRank algorithm and other methods. Pattern Anal Applic 16, 475–496 (2013). https://doi.org/10.1007/s10044-012-0299-1

  • DOI: https://doi.org/10.1007/s10044-012-0299-1
