Training Classifiers for Tree-structured Categories with Partially Labeled Data

Ortega-Moral, M.; Gutiérrez-González, D.; De-Pablo, M. L.; Cid-Sueiro, J.

doi:10.1007/s11265-006-0008-7

Abstract

In this paper we propose a new method for training classifiers for multi-class problems in which the classes are not necessarily mutually exclusive and may be related through a probabilistic tree structure. The method is based on the definition of a Bayesian model relating network parameters, feature vectors and categories. Learning is stated as a maximum likelihood estimation problem over the classifier parameters. The proposed algorithm is especially suited to situations where each training sample is labeled with respect to only one category, or a subset of the categories, in the tree. Our experiments on information retrieval scenarios show the advantages of the proposed method.
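The idea of maximum likelihood training from partial labels on a category tree can be illustrated with a small sketch. This is not the model from the paper, whose details are not given in the abstract. It assumes a hypothetical three-node tree (`PARENT`), a logistic conditional probability at each node, the rule that a category can only be active if its parent is active, and training samples that each carry a label for exactly one node. All names and values are illustrative.

```python
import math

# Hypothetical category tree (assumption): node -> parent, None marks a top-level node.
PARENT = {"sports": None, "football": "sports", "tennis": "sports"}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def marginal_prob(node, x, weights):
    """P(category `node` is active | x) under the assumed model.

    Each node has a logistic conditional P(node active | parent active, x),
    and a node cannot be active when its parent is inactive, so the marginal
    is the product of the conditionals along the path up to the top level.
    """
    p = 1.0
    while node is not None:
        w = weights[node]
        p *= sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        node = PARENT[node]
    return p


def partial_log_likelihood(samples, weights):
    """Log-likelihood of samples labeled for a single node each.

    `samples` is a list of (x, node, y) triples, where y is 1 if the sample
    belongs to `node` and 0 otherwise. Labels of all other nodes are
    unobserved and are marginalized out by using the marginal probability.
    """
    total = 0.0
    for x, node, y in samples:
        p = marginal_prob(node, x, weights)
        total += math.log(p if y == 1 else 1.0 - p)
    return total


# Illustrative data: two features, all-zero weights (every conditional is 0.5).
weights = {name: [0.0, 0.0] for name in PARENT}
samples = [([1.0, 2.0], "football", 1), ([0.5, -1.0], "sports", 0)]
log_lik = partial_log_likelihood(samples, weights)
```

A maximum likelihood learner would adjust `weights` to increase `partial_log_likelihood`, for instance by gradient ascent. Treating the unobserved labels as simply missing, as this sketch does, relies on the label observation process being independent of the label values.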

Notes

The analysis that follows rests on the implicit assumption that the class observation process is independent of the class and of the value of the class label. The analysis of more complex observation processes is beyond the scope of this paper.

Author information

Authors and Affiliations

Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Av. de la Universidad, 30, 28911, Leganés-Madrid, Spain
M. Ortega-Moral, D. Gutiérrez-González, M. L. De-Pablo & J. Cid-Sueiro

Corresponding author

Correspondence to M. Ortega-Moral.

Additional information

This paper has been partially supported by Spanish MEC grants ref. TIC 2002-03713 and TEC 2005-06766-C03-02/TCM, by Madrid Chamber grant ref. S-0505/TIC/0223 and UC3M-TEC-05-027.

About this article

Cite this article

Ortega-Moral, M., Gutiérrez-González, D., De-Pablo, M.L. et al. Training Classifiers for Tree-structured Categories with Partially Labeled Data. J VLSI Sign Process Syst Sign Im 48, 53–65 (2007). https://doi.org/10.1007/s11265-006-0008-7

Received: 31 July 2006
Revised: 18 September 2006
Accepted: 17 October 2006
Published: 27 March 2007
Issue Date: August 2007
DOI: https://doi.org/10.1007/s11265-006-0008-7
