Multi-label multi-instance learning with missing object tags

Shen, Yi; Peng, Jinye; Feng, Xiaoyi; Fan, Jianping

doi:10.1007/s00530-012-0290-0

Regular Paper
Published: 14 August 2012

Multimedia Systems, Volume 19, pages 17–36 (2013)

Abstract

In this paper, a novel framework is developed for leveraging large-scale loosely tagged images for object classifier training by jointly addressing three key issues: (a) spam tags, i.e., some tags are more related to popular query terms than to the image semantics; (b) loose object tags, i.e., multiple object tags are given at the image level without identifying the object locations in the images; and (c) missing object tags, i.e., some object tags are missing and thus negative bags may contain positive instances. To address these issues, the framework consists of three key components: (1) distributed image clustering and inter-cluster visual correlation analysis, which handle spam tags by automatically filtering out large amounts of junk images; (2) multiple instance learning with missing tag prediction, which deals with loose object tags and missing object tags jointly; and (3) structural learning, which leverages the inter-object visual correlations to train large numbers of inter-related object classifiers jointly. Our experiments on large-scale loosely tagged images have provided very positive results.
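To make the missing-tag issue concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the standard multiple instance learning convention that an image (bag) is positive for a tag if at least one of its regions (instances) is positive. The per-region scores, the `bag_score` and `predict_missing_tags` helpers, and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# A bag is one image; each instance is one image region with hypothetical
# per-tag classifier scores in [0, 1]. All numbers below are made up.
Bag = List[Dict[str, float]]


def bag_score(bag: Bag, tag: str) -> float:
    """Standard MIL assumption: a bag is as positive as its best instance."""
    return max((inst.get(tag, 0.0) for inst in bag), default=0.0)


def predict_missing_tags(
    bag: Bag, given_tags: Set[str], vocabulary: Set[str], threshold: float = 0.8
) -> Set[str]:
    """Return tags the user did not give whose bag-level score is high.

    Without this step the image would be used as a negative bag for such
    tags even though it contains a positive instance.
    """
    return {
        tag
        for tag in vocabulary
        if tag not in given_tags and bag_score(bag, tag) >= threshold
    }


# An image tagged only "dog", although one region clearly shows grass.
example_bag: Bag = [
    {"dog": 0.9, "grass": 0.2},   # region 1: mostly dog
    {"dog": 0.1, "grass": 0.85},  # region 2: mostly grass
]
vocabulary = {"dog", "grass", "car"}

missing = predict_missing_tags(example_bag, {"dog"}, vocabulary)
print(sorted(missing))  # ['grass']
```

In this toy case the image would otherwise serve as a negative bag for "grass" while containing a strongly positive region, which is the situation described in issue (c).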

Acknowledgements

This work is partly supported by NSFC-61075014 and NSFC-60875016, by the Program for New Century Excellent Talents in University under Grants NCET-08-0458 and NCET-10-0071, and by the Research Fund for the Doctoral Program of Higher Education of China (Grant Nos. 20096102110025 and 20106102110028).

Author information

Authors and Affiliations

Department of Computer Science, University of North Carolina, Charlotte, NC, 28223, USA
Yi Shen & Jianping Fan
School of Electronics and Information, Northwestern Polytechnical University, Xian, China
Jinye Peng & Xiaoyi Feng

Corresponding author

Correspondence to Yi Shen.

About this article

Cite this article

Shen, Y., Peng, J., Feng, X. et al. Multi-label multi-instance learning with missing object tags. Multimedia Systems 19, 17–36 (2013). https://doi.org/10.1007/s00530-012-0290-0

Issue Date: February 2013
