Multi-label classification via learning a unified object-label graph with sparse representation

Yao, Lina; Sheng, Quan Z.; Ngu, Anne H. H.; Gao, Byron J.; Li, Xue; Wang, Sen

doi:10.1007/s11280-015-0376-7

Published: 27 November 2015

Volume 19, pages 1125–1149, (2016)

World Wide Web

Lina Yao¹, Quan Z. Sheng¹, Anne H. H. Ngu², Byron J. Gao², Xue Li³ & Sen Wang³

Abstract

Automatic annotation is an essential technique for effectively handling and organizing Web objects (e.g., Web pages), whose volume has grown at an unprecedented rate over the last few years. Automatic annotation is usually formulated as a multi-label classification problem. Unfortunately, labeled data are often time-consuming and expensive to obtain, and Web data span a much richer feature space. This calls for new semi-supervised approaches that need less labeled data to classify effectively. In this paper, we propose a graph-based semi-supervised learning approach that leverages random walks and ℓ₁ sparse reconstruction on a mixed object-label graph with both attribute and structure information for effective multi-label classification. The mixed graph contains an object-affinity subgraph, a label-correlation subgraph, and object-label edges with adaptive weight assignments indicating the assignment relationships. The object-affinity subgraph is constructed using ℓ₁ sparse graph reconstruction with extracted structural meta-text, while the label-correlation subgraph captures pairwise correlations among labels via a linear combination of their co-occurrence similarity and kernel-based similarity. A random walk with adaptive weight assignment is then performed on the constructed mixed graph to infer probabilistic assignment relationships between labels and objects. Extensive experiments on real Yahoo! Web datasets demonstrate the effectiveness of our approach.
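
To make the mixed-graph idea concrete, the following is a minimal sketch of a random walk with restart on a small object-label graph. It is not the formulation used in the paper, whose exact weighting scheme is not given in the abstract. The function names, the fixed object-label edge weight `beta`, the restart probability, and the toy affinity matrix are all assumptions made for illustration. The object-affinity matrix `W_obj` is taken as given here (in the paper it would come from ℓ₁ sparse reconstruction), and label correlation is reduced to co-occurrence cosine similarity without the kernel-based term.

```python
import numpy as np

def label_cooccurrence_similarity(Y):
    """Cosine similarity between the label columns of a binary object-label matrix Y (n x k)."""
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for unused labels
    S = (Y.T @ Y) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)         # no self-loops on label nodes
    return S

def build_mixed_graph(W_obj, Y, beta=0.5):
    """Assemble the block adjacency [[W_obj, beta*Y], [beta*Y.T, W_lab]].

    beta is a hypothetical fixed weight for object-label edges; the paper
    describes adaptive weights, which this sketch does not reproduce.
    """
    Y = np.asarray(Y, dtype=float)
    W_lab = label_cooccurrence_similarity(Y)
    top = np.hstack([np.asarray(W_obj, dtype=float), beta * Y])
    bottom = np.hstack([beta * Y.T, W_lab])
    return np.vstack([top, bottom])

def random_walk_with_restart(A, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walker that jumps back to `seed` with probability `restart`."""
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    P = A / col_sums                 # column-stochastic transition matrix
    e = np.zeros(A.shape[0])
    e[seed] = 1.0
    p = e.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (P @ p) + restart * e
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy data: objects 0 and 1 are labeled, objects 2 and 3 are unlabeled.
W_obj = np.array([[0.0, 0.1, 0.9, 0.0],
                  [0.1, 0.0, 0.0, 0.9],
                  [0.9, 0.0, 0.0, 0.1],
                  [0.0, 0.9, 0.1, 0.0]])
Y = np.array([[1, 0],
              [0, 1],
              [0, 0],
              [0, 0]])

A = build_mixed_graph(W_obj, Y)
p = random_walk_with_restart(A, seed=2)   # start the walk from unlabeled object 2
label_scores = p[W_obj.shape[0]:]         # probabilities on the two label nodes
print(label_scores)
```

In this toy graph, object 2 is most similar to object 0, which carries the first label, so the walk started at object 2 assigns the first label node a higher score than the second. Ranking or thresholding such scores is one simple way to turn them into label assignments for unlabeled objects.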

Acknowledgements

Quan Z. Sheng’s work has been partially supported by the Australian Research Council (ARC) Discovery Grant DP140100104 and Future Fellowship Project FT140101247. Xue Li’s work has been partially supported by the Australian Research Council (ARC) Discovery Grant DP130104614. The authors would like to thank the anonymous reviewers for their valuable feedback on this work.

Author information

Authors and Affiliations

School of Computer Science, The University of Adelaide, Adelaide, SA, 5005, Australia
Lina Yao & Quan Z. Sheng
Department of Computer Science, Texas State University, San Marcos, TX, 78666-4616, USA
Anne H. H. Ngu & Byron J. Gao
School of Information Technology and Electrical Engineering, The University of Queensland, Queensland, QLD, 4072, Australia
Xue Li & Sen Wang

Corresponding author

Correspondence to Lina Yao.

About this article

Cite this article

Yao, L., Sheng, Q.Z., Ngu, A.H.H. et al. Multi-label classification via learning a unified object-label graph with sparse representation. World Wide Web 19, 1125–1149 (2016). https://doi.org/10.1007/s11280-015-0376-7

Received: 06 January 2015
Revised: 02 July 2015
Accepted: 05 November 2015
Published: 27 November 2015
Issue Date: November 2016
DOI: https://doi.org/10.1007/s11280-015-0376-7
