
Multi-label multi-instance learning with missing object tags

  • Regular Paper
  • Multimedia Systems

Abstract

In this paper, a novel framework is developed for leveraging large-scale loosely tagged images for object classifier training by addressing three key issues jointly: (a) spam tags, e.g., some tags relate more to popular query terms than to the image semantics; (b) loose object tags, e.g., multiple object tags are given at the image level without identifying the object locations in the images; (c) missing object tags, e.g., some object tags are omitted, so negative bags may contain positive instances. To address these three issues, our framework consists of the following key components: (1) distributed image clustering and inter-cluster visual correlation analysis, which handle spam tags by automatically filtering out large amounts of junk images; (2) multiple instance learning with missing tag prediction, which deals with loose object tags and missing object tags jointly; (3) structural learning, which leverages the inter-object visual correlations to train large numbers of inter-related object classifiers jointly. Our experiments on large-scale loosely tagged images have provided very positive results.
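
Issue (c) can be made concrete with a small sketch. The code below is not the authors' algorithm. It only illustrates the standard multiple-instance assumption (a bag, here an image, is as positive as its most positive instance, here a region) and shows how a bag tagged negative can still contain a positive instance. The function names, the example scores, and the 0.5 threshold are all hypothetical.

```python
from typing import List


def bag_score(instance_scores: List[float]) -> float:
    """Standard MIL assumption: a bag scores as high as its best instance."""
    return max(instance_scores)


def flag_possible_missing_tags(
    bags: List[List[float]],
    tagged: List[bool],
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> List[int]:
    """Return indices of bags that lack the tag but contain a high-scoring instance."""
    return [
        i
        for i, (scores, has_tag) in enumerate(zip(bags, tagged))
        if not has_tag and bag_score(scores) >= threshold
    ]


# Each inner list holds hypothetical classifier scores for the regions of one image.
bags = [[0.1, 0.9], [0.2, 0.3], [0.05, 0.8]]
# Whether the image carries the object tag; the third image may be missing its tag.
tagged = [True, False, False]

flagged = flag_possible_missing_tags(bags, tagged)
print(flagged)
```

Treating every untagged image as a clean negative bag would feed the third image's high-scoring region to the classifier as a negative example, which is the contamination that missing tag prediction is meant to avoid.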




Acknowledgements

This work is partly supported by NSFC-61075014 and NSFC-60875016, by the Program for New Century Excellent Talents in University under Grants NCET-08-0458 and NCET-10-0071, and by the Research Fund for the Doctoral Program of Higher Education of China (Grant Nos. 20096102110025 and 20106102110028).

Author information

Corresponding author

Correspondence to Yi Shen.

About this article

Cite this article

Shen, Y., Peng, J., Feng, X. et al. Multi-label multi-instance learning with missing object tags. Multimedia Systems 19, 17–36 (2013). https://doi.org/10.1007/s00530-012-0290-0

  • DOI: https://doi.org/10.1007/s00530-012-0290-0
