Image label completion by pursuing contextual decomposability

Published: 22 May 2012

Abstract

This article investigates how to automatically complete the missing labels of partially annotated images without image segmentation. The label completion procedure is formulated as a nonnegative data factorization problem that decomposes the global image representations, which describe entire images (for instance, various image feature descriptors), into their corresponding label representations, which describe the local semantic regions within images. The solution provided in this work is motivated by the following observations. First, the label representations of regions with the same label often share certain commonalities, yet may differ substantially because of large intraclass variations. Thus, each label or concept should be represented by a subspace spanned by an ensemble of bases, instead of a single basis, to characterize the intralabel diversity. Second, the subspaces for different labels differ from each other. Third, when two images are similar to each other, the corresponding label representations should be similar. We formulate this cross-image context, as well as the given partial label annotations, in the framework of nonnegative data factorization and then propose efficient multiplicative nonnegative update rules to alternately optimize the subspaces and the reconstruction coefficients. We also provide a theoretical proof of algorithmic convergence and correctness. Extensive experiments on several challenging image datasets demonstrate the effectiveness of the proposed solution in improving the quality of image label completion and the accuracy of image annotation. Based on the same formulation, we further develop a label ranking algorithm to refine noisy image labels without any manual supervision. We compare the proposed label ranking algorithm with state-of-the-art methods on popular evaluation databases and achieve encouraging improvements.
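
The idea of alternately optimizing a set of bases and their reconstruction coefficients with multiplicative nonnegative updates can be illustrated with plain nonnegative matrix factorization. The following is a minimal sketch only, using the standard Lee and Seung updates for the Frobenius-norm objective. It is not the article's formulation: the per-label subspaces, the cross-image context term, and the partial-label constraints are all omitted, and the names `nmf_multiplicative`, `W`, `H`, `rank`, and `eps` are illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix X (features x images) as W @ H, with W, H >= 0.

    Generic Lee-Seung multiplicative updates for ||X - W H||_F^2.
    Assumption: this stands in for the article's label-aware update rules,
    which are not given in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Strictly positive random start so no entry is stuck at zero.
    W = rng.random((n_features, rank)) + eps   # basis vectors (columns)
    H = rng.random((rank, n_samples)) + eps    # reconstruction coefficients
    errors = []
    for _ in range(n_iter):
        # Update coefficients with the bases held fixed.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Update bases with the coefficients held fixed.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(X - W @ H)))
    return W, H, errors


# Toy usage: synthetic nonnegative data with an exact rank-3 factorization.
rng = np.random.default_rng(1)
X = rng.random((6, 3)) @ rng.random((3, 8))
W, H, errors = nmf_multiplicative(X, rank=3)
print(f"error after first pass: {errors[0]:.4f}, after last pass: {errors[-1]:.4f}")
```

Because each update multiplies the current value by a ratio of nonnegative quantities, `W` and `H` stay nonnegative without any explicit projection step, which is the main appeal of multiplicative rules in this setting.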

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 8, Issue 2
    May 2012
    144 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/2168996
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 May 2012
    Accepted: 01 January 2011
    Revised: 01 December 2010
    Received: 01 August 2010
    Published in TOMM Volume 8, Issue 2

    Author Tags

    1. Image label completion
    2. Multilabel classification
    3. Image annotation
    4. Label ranking

    Qualifiers

    • Research-article
    • Research
    • Refereed
