Multi-interval Discretization of Continuous Attributes for Label Ranking

de Sá, Cláudio Rebelo; Soares, Carlos; Knobbe, Arno; Azevedo, Paulo; Jorge, Alípio Mário

doi:10.1007/978-3-642-40897-7_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8140)

Included in the following conference series:

International Conference on Discovery Science

Abstract

Label Ranking (LR) problems, such as predicting rankings of financial analysts, are becoming increasingly important in data mining. While there has been a significant amount of work on the development of learning algorithms for LR in recent years, pre-processing methods for LR are still very scarce. Yet some methods, such as Naive Bayes for LR and APRIORI-LR, cannot deal with real-valued data directly. As a makeshift solution, one could consider conventional discretization methods used in classification, simply treating each unique ranking as a separate class. In this paper, we show that such an approach has several disadvantages. As an alternative, we propose an adaptation of an existing method, MDLP, specifically for LR problems. We illustrate the advantages of the new method using synthetic data. Additionally, we present results obtained on several benchmark datasets. The results indicate that the discretization performs as expected and, in some cases, improves the results of the learning algorithms.
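
The makeshift approach mentioned in the abstract, treating each unique ranking as a separate class, can be illustrated with a minimal Python sketch. It is not the authors' method or code: the toy rankings, the function names, and the use of Kendall's tau as the similarity measure are assumptions made here for illustration, and the abstract does not enumerate the disadvantages it refers to. The sketch shows one plausible drawback: class entropy, the quantity that entropy-based discretizers such as MDLP rely on, cannot tell a set of near-identical rankings from a set of strongly conflicting ones.

```python
from collections import Counter
from itertools import combinations
from math import log2


def class_entropy(rankings):
    """Shannon entropy (bits) when every distinct ranking is its own class."""
    counts = Counter(rankings)
    n = len(rankings)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def kendall_tau(r1, r2):
    """Kendall's tau between two rank vectors of equal length, no ties."""
    pairs = list(combinations(range(len(r1)), 2))
    concordant = sum(
        1 for i, j in pairs if (r1[i] - r1[j]) * (r2[i] - r2[j]) > 0
    )
    discordant = len(pairs) - concordant
    return (concordant - discordant) / len(pairs)


def mean_pairwise_tau(rankings):
    """Average Kendall's tau over all pairs of rankings in a set."""
    taus = [kendall_tau(a, b) for a, b in combinations(rankings, 2)]
    return sum(taus) / len(taus)


# Hypothetical toy data: each tuple holds the ranks of four labels.
similar = [(1, 2, 3, 4), (1, 2, 4, 3), (2, 1, 3, 4), (1, 3, 2, 4)]
opposed = [(1, 2, 3, 4), (4, 3, 2, 1), (2, 1, 4, 3), (3, 4, 1, 2)]

# Both sets contain four distinct "classes", so their entropy is identical,
# although the rankings in `similar` agree far more than those in `opposed`.
print(class_entropy(similar), class_entropy(opposed))
print(mean_pairwise_tau(similar), mean_pairwise_tau(opposed))
```

Under these assumptions, a discretizer driven only by class entropy would regard both sets as equally impure, whereas a ranking-aware measure separates them. This is the kind of information that a method designed specifically for LR can exploit.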

Author information

Authors and Affiliations

INESCTEC, Porto, Portugal
Cláudio Rebelo de Sá & Carlos Soares
LIACS, Universiteit Leiden, Netherlands
Cláudio Rebelo de Sá & Arno Knobbe
CCTC, Departamento de Informática, Universidade do Minho, Portugal
Paulo Azevedo
DM - Faculdade de Ciencias, Universidade do Porto, Portugal
Alípio Mário Jorge

Editor information

Editors and Affiliations

TU Darmstadt, Germany
Johannes Fürnkranz
Philipps-Universität Marburg, Germany
Eyke Hüllermeier
The Institute of Statistical Mathematics, Tokyo, Japan
Tomoyuki Higuchi

Cite this paper

de Sá, C.R., Soares, C., Knobbe, A., Azevedo, P., Jorge, A.M. (2013). Multi-interval Discretization of Continuous Attributes for Label Ranking. In: Fürnkranz, J., Hüllermeier, E., Higuchi, T. (eds) Discovery Science. DS 2013. Lecture Notes in Computer Science, vol 8140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40897-7_11

DOI: https://doi.org/10.1007/978-3-642-40897-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40896-0
Online ISBN: 978-3-642-40897-7
eBook Packages: Computer Science, Computer Science (R0)
