DOI: 10.1145/1401890.1401971
Research Article

Hypergraph spectral learning for multi-label classification

Published: 24 August 2008

ABSTRACT

A hypergraph is a generalization of the traditional graph in which each edge is an arbitrary non-empty subset of the vertex set. It has been applied successfully to capture high-order relations in various domains. In this paper, we propose a hypergraph spectral learning formulation for multi-label classification, in which a hypergraph is constructed to exploit the correlation information among different labels. We show that the proposed formulation leads to an eigenvalue problem, which can be computationally expensive, especially for large-scale problems. To reduce the computational cost, we propose an approximate formulation, which is shown to be equivalent to a least squares problem under a mild condition. Based on the approximate formulation, efficient algorithms for solving least squares problems can be applied to scale the formulation to very large data sets. In addition, existing regularization techniques for least squares can be incorporated into the model for improved generalization performance. We have conducted experiments on large-scale benchmark data sets; the results show that the proposed hypergraph spectral learning formulation is effective in capturing high-order relations in multi-label problems, and that the approximate formulation is much more efficient than the original one while achieving competitive classification performance.
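The pipeline the abstract describes (a label-derived hypergraph, its normalized Laplacian, a spectral embedding from an eigenvalue problem, and a regularized least-squares surrogate) can be sketched as follows. This is a minimal illustration, not the paper's exact method: the toy data, the unit hyperedge weights, the embedding dimension, and the ridge parameter `lam` are all assumptions, and the Laplacian follows the common normalized form for hypergraphs.

```python
import numpy as np

# Toy multi-label data: n samples with d features and k labels (illustrative only).
n, d, k = 6, 4, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
# Label indicator matrix Y: Y[i, j] = 1 if sample i carries label j.
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [1, 0, 0],
              [0, 0, 1]], dtype=float)

# Hypergraph: one hyperedge per label, containing all samples with that label.
H = Y                                   # n x k vertex-hyperedge incidence matrix
w = np.ones(k)                          # hyperedge weights (unit, for illustration)
d_v = H @ w                             # vertex degrees
d_e = H.sum(axis=0)                     # hyperedge degrees
Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))

# Normalized hypergraph Laplacian: L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
S = Dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ Dv_inv_sqrt
L = np.eye(n) - S

# Spectral route: eigenvectors of L for the smallest eigenvalues give a
# label-correlation-aware embedding -- this is the eigenvalue problem that
# becomes expensive at scale.
eigvals, eigvecs = np.linalg.eigh(L)
T = eigvecs[:, :2]                      # 2-D target embedding (dimension assumed)

# Least-squares route: regress the embedding on the features, with an optional
# ridge penalty for better generalization (lam is an illustrative value).
lam = 1e-2
W_ls = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)
print(W_ls.shape)
```

For large, sparse data the dense `solve` above would be replaced by an iterative sparse least-squares solver, which is what makes the approximate formulation scale to very large data sets.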


Published in

KDD '08: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2008, 1116 pages
ISBN: 9781605581934
DOI: 10.1145/1401890
General Chair: Ying Li
Program Chairs: Bing Liu, Sunita Sarawagi

Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Acceptance Rates

KDD '08 paper acceptance rate: 118 of 593 submissions (20%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).
