An Empirical Comparison of Flat and Hierarchical Performance Measures for Multi-Label Classification with Hierarchy Extraction

Brucker, Florian; Benites, Fernando; Sapozhnikova, Elena

doi:10.1007/978-3-642-23851-2_59

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6881)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

Multi-label Classification (MC) often deals with hierarchically organized class taxonomies. In contrast to Hierarchical Multi-label Classification (HMC), where the class hierarchy is assumed to be known a priori, we are interested in the opposite case where it is unknown and should be extracted from multi-label data automatically. In this case the predictive performance of a classifier can be assessed by well-known Performance Measures (PMs) used in flat MC such as precision and recall. The fact that these PMs treat all class labels as independent labels, in contrast to hierarchically structured taxonomies, is a problem. As an alternative, special hierarchical PMs can be used that utilize hierarchy knowledge and apply this knowledge to the extracted hierarchy. This type of hierarchical PM has only recently been mentioned in literature. The aim of this study is first to verify whether HMC measures do significantly improve quality assessment in this setting. In addition, we seek to find a proper measure that reflects the potential quality of extracted hierarchies in the best possible way. We empirically compare ten hierarchical and four traditional flat PMs in order to investigate relations between them. The performance measurements obtained for predictions of four multi-label classifiers ML-ARAM, ML-kNN, BoosTexter and SVM on four datasets from the text mining domain are analyzed by means of hierarchical clustering and by calculating pairwise statistical consistency and discriminancy.
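The difference between flat and hierarchical measures can be illustrated with a minimal sketch. It assumes one common hierarchical variant in which each label set is extended with its ancestors before precision and recall are computed; this is not necessarily one of the ten hierarchical PMs compared in the paper. The toy hierarchy `PARENT`, the label names, and the helper functions are hypothetical and chosen for illustration only.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy hierarchy (child -> parent); not taken from the paper.
PARENT: Dict[str, str] = {
    "sports": "root",
    "science": "root",
    "football": "sports",
    "tennis": "sports",
    "physics": "science",
}


def with_ancestors(labels: Iterable[str]) -> Set[str]:
    """Extend a label set with all ancestors, excluding the artificial root."""
    extended: Set[str] = set()
    for label in labels:
        node = label
        while node != "root":
            extended.add(node)
            node = PARENT[node]
    return extended


def precision_recall(true: Set[str], pred: Set[str]) -> Tuple[float, float]:
    """Flat set-based precision and recall for a single example."""
    hits = len(true & pred)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(true) if true else 0.0
    return precision, recall


def hierarchical_precision_recall(
    true: Set[str], pred: Set[str]
) -> Tuple[float, float]:
    """Same formulas, applied to the ancestor-extended label sets (assumed variant)."""
    return precision_recall(with_ancestors(true), with_ancestors(pred))


true_labels = {"football"}
near_miss = {"tennis"}   # wrong label, but a sibling under "sports"
far_miss = {"physics"}   # wrong label in an unrelated branch

flat_near = precision_recall(true_labels, near_miss)
flat_far = precision_recall(true_labels, far_miss)
hier_near = hierarchical_precision_recall(true_labels, near_miss)
hier_far = hierarchical_precision_recall(true_labels, far_miss)

print("flat:        ", flat_near, flat_far)
print("hierarchical:", hier_near, hier_far)
```

In this sketch the flat measures score both wrong predictions identically, because they treat labels as independent. The ancestor-extended variant gives partial credit to the sibling prediction, since it shares the parent class with the true label.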

Author information

Authors and Affiliations

Department of Computer and Information Science, University of Konstanz, Germany
Florian Brucker, Fernando Benites & Elena Sapozhnikova

Editor information

Editors and Affiliations

Institute of Integrated Sensor Systems, University of Kaiserslautern, Erwin-Schroedinger-str. 12, 67663, Kaiserslautern, Germany
Andreas König
Knowledge-Based Systems Group, Department of Computer Science, University of Kaiserslautern, P.O. Box 3049, 67653, Kaiserslautern, Germany
Andreas Dengel
School of Business, University of Applied Sciences Northwestern Switzerland, Riggenbachstr. 16, 4600, Olten, Switzerland
Knut Hinkelmann
Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuen-cho, 599-8531, Sakai, Osaka, Japan
Koichi Kise
KES International, P.O. Box 2115, BN43 9AF, Shoreham-by-sea, UK
Robert J. Howlett
University of South Australia, Mawson Lakes, 5095, Adelaide, SA, Australia
Lakhmi C. Jain

About this paper

Cite this paper

Brucker, F., Benites, F., Sapozhnikova, E. (2011). An Empirical Comparison of Flat and Hierarchical Performance Measures for Multi-Label Classification with Hierarchy Extraction. In: König, A., Dengel, A., Hinkelmann, K., Kise, K., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2011. Lecture Notes in Computer Science, vol 6881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23851-2_59

DOI: https://doi.org/10.1007/978-3-642-23851-2_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23850-5
Online ISBN: 978-3-642-23851-2
eBook Packages: Computer Science, Computer Science (R0)
