Abstract
In this survey we discuss the task of hierarchical classification. The literature on this field is scattered across very different application domains, and as a result research in one domain is often carried out without awareness of methods developed in other domains. We define the task of hierarchical classification and discuss why some related tasks should not be considered hierarchical classification. We also present a new perspective on some existing hierarchical classification approaches and, based on that perspective, propose a new unifying framework to classify the existing approaches. Finally, we review empirical comparisons of the existing methods reported in the literature and offer a conceptual comparison of those methods at a high level of abstraction, discussing their advantages and disadvantages.
Cite this article
Silla, C.N., Freitas, A.A. A survey of hierarchical classification across different application domains. Data Min Knowl Disc 22, 31–72 (2011). https://doi.org/10.1007/s10618-010-0175-9
DOI: https://doi.org/10.1007/s10618-010-0175-9