Automatic tag-to-region assignment via multiple instance learning

Multimedia Tools and Applications

Abstract

Translating image-level tags to regions (i.e., tag-to-region assignment) could play an important role in leveraging loosely labeled training images for object classifier training, and it has become a popular topic in the multimedia research community. In this paper, a novel two-stage multiple instance learning algorithm is presented for automatic tag-to-region assignment. Regions are generated by multiple-scale image segmentation, and the instances with unique semantics are selected from those regions by a random walk process. In the first stage, affinity propagation (AP) clustering and the Hausdorff distance are applied to the instances to identify the most positive instance, which is used to initialize the search for the maximum of the Diverse Density likelihood. In the second stage, the most contributive instance chosen from each bag is treated as the key instance, which simplifies the computation of the Diverse Density likelihood. Finally, an automatic method is proposed to determine the boundary between positive and negative instances. Our experiments on three well-known image sets have provided positive results.
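To make the two quantities named in the abstract concrete, the following is a minimal Python sketch of the Hausdorff distance between two bags and of the Diverse Density likelihood in its standard noisy-or form, as introduced by Maron and Lozano-Pérez [21]. It is not the two-stage algorithm of this paper, whose details are not given in the abstract. All function names are hypothetical, instances are assumed to be plain feature tuples with unit feature scaling, and `best_instance_start` is only a simple stand-in for an initialization step: it scans the instances of the positive bags instead of using AP clustering.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def hausdorff(bag_a, bag_b):
    """Symmetric Hausdorff distance between two bags of instances."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.sqrt(sq_dist(p, q)) for q in dst) for p in src)

    return max(directed(bag_a, bag_b), directed(bag_b, bag_a))


def diverse_density(t, positive_bags, negative_bags):
    """Noisy-or Diverse Density of a candidate concept point t.

    Assumes Pr(instance belongs to concept t) = exp(-||instance - t||^2).
    """
    dd = 1.0
    for bag in positive_bags:
        # Probability that no instance of the bag matches the concept.
        miss = 1.0
        for inst in bag:
            miss *= 1.0 - math.exp(-sq_dist(inst, t))
        # A positive bag should contain at least one matching instance.
        dd *= 1.0 - miss
    for bag in negative_bags:
        # A negative bag should contain no matching instance.
        for inst in bag:
            dd *= 1.0 - math.exp(-sq_dist(inst, t))
    return dd


def best_instance_start(positive_bags, negative_bags):
    """Hypothetical initializer: the positive-bag instance with highest DD."""
    candidates = [inst for bag in positive_bags for inst in bag]
    return max(
        candidates,
        key=lambda t: diverse_density(t, positive_bags, negative_bags),
    )
```

In this sketch, a point that lies close to at least one instance of every positive bag and far from all instances of the negative bags receives a Diverse Density near 1, while a point close to any negative instance is pushed toward 0. A gradient-based search for the maximum would start from the point returned by the initializer.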

Notes

  1. http://vision.ece.ucsb.edu/segmentation/jseg/

  2. http://research.microsoft.com/en-us/projects/objectclassrecognition/; we use version 2.0.

  3. These object categories are presented in Fig. 8 and the Appendix.

  4. LIBSVM [4] is used as the SVM implementation in mi-SVM and RW-SVM.

References

  1. Andrews S, Tsochantaridis I, Hofmann T (2002) Support vector machines for multiple-instance learning. Adv Neural Inf Proc Syst 15:561–568

  2. Bunescu R, Mooney R (2007) Multiple instance learning for sparse positive bags. In: Proceedings of the 24th International Conference on Machine Learning (ICML), pp 105–112

  3. Carneiro G, Chan A, Moreno P, Vasconcelos N (2007) Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans Pattern Anal Mach Intell 29(3):394–410

  4. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27

  5. Chen Y, Bi J, Wang J (2006) MILES: multiple-instance learning via embedded instance selection. IEEE Trans Pattern Anal Mach Intell 28(12):1931–1947

  6. Chua T, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) NUS-WIDE: a real-world web image database from National University of Singapore. In: Proceedings of the ACM international conference on image and video retrieval, p 48

  7. Coleman TF, Li Y (1996) An interior trust region approach for nonlinear minimization subject to bounds. SIAM J Optim 6(2):418–445

  8. Cusano C, Ciocca G, Schettini R (2004) Image annotation using SVM. In: Society of Photo-Optical Instrumentation Engineers conference (SPIE), vol 5304, pp 330–338

  9. Deng Y, Manjunath B, Shin H (1999) Color image segmentation. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), vol 2

  10. Dietterich T, Lathrop R, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71

  11. Fan J, Shen Y, Zhou N, Gao Y (2010) Harvesting large-scale weakly-tagged image databases from the web. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), pp 802–809

  12. Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972

  13. Jeon J, Lavrenko V, Manmatha R (2003) Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 119–126

  14. Lew M, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: state of the art and challenges. ACM Trans Multimed Comput Commun Appl (TOMCCAP) 2(1):1–19

  15. Liu D, Hua X, Zhang H (2011) Content-based tag processing for internet social images. Multimed Tools Appl 51:723–738

  16. Liu D, Yan S, Rui Y, Zhang H (2010) Unified tag analysis with multi-edge graph. In: Proceedings of the international conference on Multimedia (ACM MM), pp 25–34

  17. Li F, Fergus R, Torralba A (2007) Recognizing and learning object categories. CVPR 2007 short course

  18. Li J, Wang J (2008) Real-time computerized annotation of pictures. IEEE Trans Pattern Anal Mach Intell 30(6):985–1002

  19. Liu S, Yan S, Zhang T, Xu C, Liu J, Lu H (2012) Weakly-supervised graph propagation towards collective image parsing. IEEE Trans Multimedia 14(2):361–373

  20. Liu X, Cheng B, Yan S, Tang J, Chua T, Jin H (2009) Label to region by bi-layer sparsity priors. In: Proceedings of the 17th ACM international conference on multimedia, pp 115–124

  21. Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. In: Advances in neural information processing systems, pp 570–576

  22. Maron O, Ratan A (1998) Multiple-instance learning for natural scene classification. In: Proceedings of the fifteenth international conference on machine learning, vol 15, pp 341–349

  23. Platt J, et al (1998) Sequential minimal optimization: a fast algorithm for training support vector machines. Technical report msr-tr-98-14, Microsoft Research

  24. Qi G, Hua X, Rui Y, Mei T, Tang J, Zhang H (2007) Concurrent multiple instance learning for image categorization. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8

  25. Russell B, Freeman W, Efros A, Sivic J, Zisserman A (2006) Using multiple segmentations to discover objects and their extent in image collections. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), pp 1605–1614

  26. Shen Y, Fan J (2010) Leveraging loosely-tagged images and inter-object correlations for tag recommendation. In: Proceedings of the international conference on Multimedia (ACM MM), pp 5–14

  27. Smeulders A, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380

  28. Tang J, Hong R, Yan S, Chua T, Qi G, Jain R (2011) Image annotation by kNN-sparse graph-based label propagation over noisily tagged web images. ACM Trans Intell Syst Technol 2(2):14

  29. Vijayanarasimhan S, Grauman K (2008) Keywords to visual categories: multiple-instance learning for weakly supervised object categorization. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8

  30. Viola P, Platt J, Zhang C (2006) Multiple instance boosting for object detection. Adv Neural Inf Proc Syst 18:1417

  31. Wang D, Li J, Zhang B (2006) Multiple-instance learning via random walk. In: Machine learning: ECML 2006, pp 473–484

  32. Wang J, Zucker J (2000) Solving the multiple-instance problem: a lazy learning approach. In: Proc. 17th international conf. on machine learning, pp 1119–1125

  33. Yang K, Hua X, Wang M, Zhang H (2011) Tag tagging: towards more descriptive keywords of image content. IEEE Trans Multimedia 13(4):662–673

  34. Zha Z, Hua X, Mei T, Wang J, Qi G, Wang Z (2008) Joint multi-label multi-instance learning for image classification. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8

  35. Zhang M, Zhou Z (2009) Multi-instance clustering with applications to multi-instance prediction. Appl Intell 31(1):47–68

  36. Zhang Q, Goldman S (2001) EM-DD: an improved multiple-instance learning technique. Adv Neural Inf Proc Syst 14:1073–1080

Acknowledgements

The authors would like to thank Jonathan Fortune for language polishing. This work is partly supported by the Doctorate Foundation of Northwestern Polytechnical University (No. CX201113), the Doctoral Program of Higher Education of China (Grant Nos. 20106102110028 and 20116102110027) and the National Science Foundation of China (Grant Nos. 61075014 and 61272285).

Author information

Corresponding author

Correspondence to Zhaoqiang Xia.

Appendix: Parts 2 and 3 of the experiments on COREL30K

Fig. 10 Average accuracy on the 92 categories (parts 2 and 3 of the 121 categories) of the COREL30K dataset using seven approaches: (a) mi-SVM; (b) RW-SVM; (c) EM-DD; (d) Our Method; (e) Our Method without MIG; (f) Our Method without CI; (g) Our Method without TS

About this article

Cite this article

Xia, Z., Shen, Y., Feng, X. et al. Automatic tag-to-region assignment via multiple instance learning. Multimed Tools Appl 74, 979–1002 (2015). https://doi.org/10.1007/s11042-013-1707-2

  • DOI: https://doi.org/10.1007/s11042-013-1707-2
