Weighted distance-based trees for ranking data

Plaia, Antonella; Sciandra, Mariangela

doi:10.1007/s11634-017-0306-x

Regular Article
Published: 16 December 2017

Volume 13, pages 427–444, (2019)

Advances in Data Analysis and Classification

541 Accesses
12 Citations

Abstract

Within the framework of preference rankings, interest often lies in finding which predictors and which interactions explain the observed preference structures, because preference decisions usually depend on the characteristics of both the judges and the objects being judged. This work proposes a univariate decision tree for ranking data based on weighted distances for complete and incomplete rankings, and uses the area under the ROC curve for both pruning and model assessment. Two well-known real datasets, the SUSHI preference data and the University ranking data, are used to illustrate the performance of the methodology.
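To make the idea of a distance between rankings concrete, the following is a minimal sketch of the classical Kemeny distance between two rank vectors (ties allowed), with an optional per-item weight. The weighting shown here, where each pair of items is weighted by the product of hypothetical item weights, is an illustrative assumption and is not claimed to be the weighting scheme used in the paper. The function name `kemeny_distance` and the `weights` parameter are likewise made up for this example.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_distance(a, b, weights=None):
    """Kemeny-type distance between two rank vectors.

    a[i] and b[i] are the ranks given to item i (lower = more preferred,
    equal ranks = tie). `weights` is a hypothetical list of per-item
    weights; with weights=None every item has weight 1 and the result
    is the usual unweighted Kemeny distance.
    """
    if len(a) != len(b):
        raise ValueError("rank vectors must have the same length")
    n = len(a)
    if weights is None:
        weights = [1.0] * n
    total = 0.0
    for i, j in combinations(range(n), 2):
        # Pairwise score: +1 if i is preferred to j, -1 if j is
        # preferred to i, 0 if the two items are tied.
        s_a = sign(a[j] - a[i])
        s_b = sign(b[j] - b[i])
        # A reversed pair contributes 2, a tie versus a strict
        # preference contributes 1 (before weighting).
        total += weights[i] * weights[j] * abs(s_a - s_b)
    return total


# Two judges who rank three items in exactly opposite order.
print(kemeny_distance([1, 2, 3], [3, 2, 1]))
```

With unit weights, fully reversed rankings of three items disagree on all three pairs, so the distance is 6. Raising the weight of an item makes disagreements involving that item more costly.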

Notes

Preference rankings can be represented through either rank vectors (as in this paper) or order vectors (D’Ambrosio et al. 2015).
This is not the optimal tree in terms of size; we chose to prune the tree to a size that ensures a reasonable trade-off between predictive accuracy and complexity.
Data are available at http://www.kamishima.net/asset/sushi3.tgz.
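The two representations mentioned in the first note can be converted into one another. The sketch below assumes complete rankings without ties and items labelled 1 to n; the function names `rank_to_order` and `order_to_rank` are hypothetical and chosen only for this illustration.

```python
def rank_to_order(rank):
    """Convert a rank vector into an order vector.

    rank[i - 1] is the rank given to item i (1 = most preferred).
    The result lists the item labels from most to least preferred.
    """
    pairs = sorted(enumerate(rank, start=1), key=lambda p: p[1])
    return [item for item, _ in pairs]


def order_to_rank(order):
    """Convert an order vector back into a rank vector.

    order[k - 1] is the label of the item placed in position k.
    """
    rank = [0] * len(order)
    for position, item in enumerate(order, start=1):
        rank[item - 1] = position
    return rank


# Item 3 is ranked first, item 1 second, item 2 third.
print(rank_to_order([2, 3, 1]))  # [3, 1, 2]
print(order_to_rank([3, 1, 2]))  # [2, 3, 1]
```

Under these assumptions the two functions are inverses of each other, so either representation carries the same information.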

References

Amodio S, D’Ambrosio A, Siciliano R (2016) Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. Eur J Oper Res 249(2):667–676. https://doi.org/10.1016/j.ejor.2015.08.048
Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth and Brooks, London
Chen J, Li Y, Feng L (2012) A new weighted Spearman’s footrule as a measure of distance between rankings. CoRR. arXiv:1207.2541
Cheng W, Hühn J, Hüllermeier E (2009) Decision tree and instance-based learning for label ranking. In: Bottou L, Littman M (eds) Proceedings of the 26th international conference on machine learning. Omnipress, Montreal, pp 161–168
Cook WD (2006) Distance based and ad hoc consensus models in ordinal preference ranking. Eur J Oper Res 172:369–385. https://doi.org/10.1016/j.ejor.2005.03.048
Cook W, Kress M, Seiford LM (1986) An axiomatic approach to distance on partial orderings. Rev Française Autom Informatique Rech Opérationnelle Rech Opérationnelle 20(2):115–122
D’Ambrosio A (2007) Tree based methods for data editing and preference rankings. Ph.D. thesis, Università degli Studi di Napoli “Federico II”
D’Ambrosio A, Amodio S (2015) ConsRank: compute the median ranking(s) according to the Kemeny’s axiomatic approach. R package version 1.0.2. http://CRAN.R-project.org/package=ConsRank
D’Ambrosio A, Amodio S, Iorio C (2015) Two algorithms for finding optimal solutions of the Kemeny rank aggregation problem for full rankings. Electron J Appl Stat Anal 8(2). http://siba-ese.unisalento.it/index.php/ejasa/article/view/14986
Dittrich R, Hatzinger R, Katzenbeisser W (1998) Modelling the effect of subject-specific covariates in paired comparison studies with an application to university rankings. J R Stat Soc Ser C (Appl Stat) 47(4):511–525. https://doi.org/10.1111/1467-9876.00125
Edmond EJ, Mason DW (2000) A new technique for high level decision support. Technical Report DOR (CAM) Project Report 2000/13. https://doi.org/10.1002/meda.313
Edmond EJ, Mason DW (2002) A new rank correlation coefficient with application to the consensus ranking problem. J Multi-criteria Decision Anal 11:17–28
Farnoud F, Touri B, Milenkovic O (2012) Novel distance measures for vote aggregation. arXiv:1203.6371
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861–874. https://doi.org/10.1016/j.patrec.2005.10.010
García-Lapresta JL, Pérez-Román D (2010) Consensus measures generated by weighted Kemeny distances on weak orders. In: Proceedings of the 10th international conference on intelligent systems design and applications, Cairo
Good IJ (1980) The number of orderings of n candidates when ties and omissions are both allowed. J Stat Comput Simul 10(2):159–159. https://doi.org/10.1080/00949658008810357
Hand DJ, Till RJ (2001) A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach Learn 45(2):171–186. https://doi.org/10.1023/A:1010920819831
Hüllermeier E, Fürnkranz J, Cheng W, Brinker K (2008) Label ranking by learning pairwise preferences. Artif Intell 172(16):1897–1916
Kamishima T (2003) Nantonac collaborative filtering: recommendation based on order responses. In: 9th international conference on knowledge discovery and data mining, KDD2003, pp 583–588
Kemeny JG, Snell JL (1962) Preference rankings: an axiomatic approach. MIT Press, Cambridge
Kumar R, Indrayan A (2011) Receiver operating characteristic (ROC) curve for medical researchers. Indian Pediatrics 48(4):277–287. https://doi.org/10.1007/s13312-011-0055-4
Kumar R, Vassilvitskii S (2010) Generalized distances between rankings. In: Proceedings of the 19th international conference on World Wide Web, WWW ’10, New York, NY, USA. ACM, pp 571–580. https://doi.org/10.1145/1772690.1772749
Lee PH, Yu PL (2010) Distance-based tree models for ranking data. Comput Stat Data Anal 54(6):1672–1682. https://doi.org/10.1016/j.csda.2010.01.027
Marcus P (2013) Comparison of heterogeneous probability models for ranking data. Master thesis. http://www.math.leidenuniv.nl/scripties/1MasterMarcus.pdf
Piccarreta R (2010) Binary trees for dissimilarity data. Comput Stat Data Anal 54(6):1516–1524. https://doi.org/10.1016/j.csda.2009.12.011
Ripley B (1996) Pattern recognition and neural networks. Cambridge University Press, Cambridge
Sciandra M, Plaia A, Capursi V (2016) Classification trees for multivariate ordinal response: an application to student evaluation teaching. Qual Quant. https://doi.org/10.1007/s11135-016-0430-2
Shih Y-S (2001) Selecting the best splits for classification trees with categorical variables. Stat Probab Lett 54(4):341–345. https://doi.org/10.1016/S0167-7152(00)00188-7
Strobl C, Wickelmaier F, Zeileis A (2011) Accounting for individual differences in Bradley-Terry models by means of recursive partitioning. J Educ Behav Stat 36(2):135–153. https://doi.org/10.3102/1076998609359791
Therneau T, Clinic M (2015) User written splitting functions for Rpart. https://cran.r-project.org/web/packages/rpart/vignettes/usercode.pdf
Therneau T, Atkinson B, Ripley B (2015) Rpart: recursive partitioning and regression trees. R package version 4.1-10. http://CRAN.R-project.org/package=rpart
Yu PL, Wan WM, Lee PH (2010) Decision tree modeling for ranking data. In: Preference learning. Springer, Berlin, pp 83–106

Acknowledgements

We would like to thank Antonio D’Ambrosio for suggesting the SUSHI Preference data set.

Author information

Authors and Affiliations

Department of Scienze Economiche, Aziendali e Statistiche, University of Palermo, Palermo, Italy
Antonella Plaia & Mariangela Sciandra

Corresponding author

Correspondence to Antonella Plaia.

About this article

Cite this article

Plaia, A., Sciandra, M. Weighted distance-based trees for ranking data. Adv Data Anal Classif 13, 427–444 (2019). https://doi.org/10.1007/s11634-017-0306-x

Received: 16 June 2016
Revised: 09 November 2017
Accepted: 11 December 2017
Published: 16 December 2017
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s11634-017-0306-x
