Abstract
Tag-to-region assignment, the task of translating image-level tags to image regions, can play an important role in leveraging loosely labeled images for object classifier training and has become a popular topic in the multimedia research community. This paper presents a novel two-stage multiple instance learning algorithm for automatic tag-to-region assignment. Regions are generated by multiple-scale image segmentation, and instances with unique semantics are selected from those regions by a random walk process. In the first stage, affinity propagation (AP) clustering and the Hausdorff distance are applied to the instances to identify the most positive instance, which is used to initialize the search for the maximum of the Diverse Density likelihood. In the second stage, the most contributive instance of each bag is treated as the key instance, which simplifies the computation of the Diverse Density likelihood. Finally, an automatic method is proposed to determine the boundary between positive and negative instances. Experiments on three well-known image sets yield positive results.
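For readers unfamiliar with the Diverse Density likelihood mentioned above, the following is a minimal sketch of the standard noisy-or formulation from Maron and Lozano-Pérez (1998), not the two-stage procedure of this paper. The function names (`instance_prob`, `bag_prob`, `neg_log_dd`, `best_start`), the unscaled Gaussian-like instance model, and the `eps` clamp are illustrative assumptions. In place of the AP clustering and Hausdorff distance step, `best_start` simply scores every instance from the positive bags and returns the one with the highest likelihood, which is the kind of starting point a maximum search would be initialized from.

```python
import math

def instance_prob(instance, target):
    """Pr(instance belongs to the concept at `target`), modeled as exp(-squared distance)."""
    sq_dist = sum((x - t) ** 2 for x, t in zip(instance, target))
    return math.exp(-sq_dist)

def bag_prob(bag, target, positive):
    """Noisy-or probability that a bag is consistent with the concept at `target`."""
    none_match = 1.0
    for instance in bag:
        none_match *= 1.0 - instance_prob(instance, target)
    # A positive bag needs at least one matching instance; a negative bag needs none.
    return 1.0 - none_match if positive else none_match

def neg_log_dd(target, positive_bags, negative_bags, eps=1e-12):
    """Negative log Diverse Density at `target` (lower means a more likely concept)."""
    total = 0.0
    for bag in positive_bags:
        total -= math.log(max(bag_prob(bag, target, True), eps))
    for bag in negative_bags:
        total -= math.log(max(bag_prob(bag, target, False), eps))
    return total

def best_start(positive_bags, negative_bags):
    """Assumed stand-in for initialization: best-scoring instance among positive bags."""
    candidates = [inst for bag in positive_bags for inst in bag]
    return min(candidates, key=lambda t: neg_log_dd(t, positive_bags, negative_bags))

# Toy usage: each bag is an image, each instance a 2-D region feature (made-up values).
positive_bags = [[(0.0, 0.0), (5.0, 5.0)], [(0.1, 0.0), (9.0, 1.0)]]
negative_bags = [[(5.0, 5.1), (9.0, 1.1)]]
start = best_start(positive_bags, negative_bags)
```

In this toy data the instances near the origin appear in both positive bags and in no negative bag, so they score better than instances that also lie close to a negative-bag instance.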
Notes
http://research.microsoft.com/en-us/projects/objectclassrecognition/ (version 2.0 is used).
LIBSVM [4] is used as the SVM implementation in mi-SVM and RW-SVM.
References
Andrews S, Tsochantaridis I, Hofmann T (2002) Support vector machines for multiple-instance learning. Adv Neural Inf Proc Syst 15:561–568
Bunescu R, Mooney R (2007) Multiple instance learning for sparse positive bags. In: Proceedings of the 24th International Conference on Machine Learning (ICML), pp 105–112
Carneiro G, Chan A, Moreno P, Vasconcelos N (2007) Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans Pattern Anal Mach Intell 29(3):394–410
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27
Chen Y, Bi J, Wang J (2006) MILES: multiple-instance learning via embedded instance selection. IEEE Trans Pattern Anal Mach Intell 28(12):1931–1947
Chua T, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) NUS-WIDE: a real-world web image database from National University of Singapore. In: Proceedings of the ACM international conference on image and video retrieval, p 48
Coleman TF, Li Y (1996) An interior trust region approach for nonlinear minimization subject to bounds. SIAM J Optim 6(2):418–445
Cusano C, Ciocca G, Schettini R (2004) Image annotation using SVM. In: Society of Photo-Optical Instrumentation Engineers conference (SPIE), vol 5304, pp 330–338
Deng Y, Manjunath B, Shin H (1999) Color image segmentation. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), vol 2
Dietterich T, Lathrop R, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71
Fan J, Shen Y, Zhou N, Gao Y (2010) Harvesting large-scale weakly-tagged image databases from the web. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), pp 802–809
Frey B, Dueck D (2007) Clustering by passing messages between data points. Science 315(5814):972
Jeon J, Lavrenko V, Manmatha R (2003) Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 119–126
Lew M, Sebe N, Djeraba C, Jain R (2006) Content-based multimedia information retrieval: state of the art and challenges. ACM Trans Multimed Comput Commun Appl (TOMCCAP) 2(1):1–19
Liu D, Hua X, Zhang H (2011) Content-based tag processing for internet social images. Multimed Tools Appl 51:723–738
Liu D, Yan S, Rui Y, Zhang H (2010) Unified tag analysis with multi-edge graph. In: Proceedings of the international conference on Multimedia (ACM MM), pp 25–34
Li F, Fergus R, Torralba A (2007) Recognizing and learning object categories. CVPR 2007 short course
Li J, Wang J (2008) Real-time computerized annotation of pictures. IEEE Trans Pattern Anal Mach Intell 30(6):985–1002
Liu S, Yan S, Zhang T, Xu C, Liu J, Lu H (2012) Weakly-supervised graph propagation towards collective image parsing. IEEE Trans Multimedia 14(2):361–373
Liu X, Cheng B, Yan S, Tang J, Chua T, Jin H (2009) Label to region by bi-layer sparsity priors. In: Proceedings of the 17th ACM international conference on multimedia, pp 115–124
Maron O, Lozano-Pérez T (1998) A framework for multiple-instance learning. In: Advances in neural information processing systems, pp 570–576
Maron O, Ratan A (1998) Multiple-instance learning for natural scene classification. In: Proceedings of the fifteenth international conference on machine learning, vol 15, pp 341–349
Platt J, et al (1998) Sequential minimal optimization: a fast algorithm for training support vector machines. Technical report MSR-TR-98-14, Microsoft Research
Qi G, Hua X, Rui Y, Mei T, Tang J, Zhang H (2007) Concurrent multiple instance learning for image categorization. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8
Russell B, Freeman W, Efros A, Sivic J, Zisserman A (2006) Using multiple segmentations to discover objects and their extent in image collections. In: IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR), pp 1605–1614
Shen Y, Fan J (2010) Leveraging loosely-tagged images and inter-object correlations for tag recommendation. In: Proceedings of the international conference on Multimedia (ACM MM), pp 5–14
Smeulders A, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
Tang J, Hong R, Yan S, Chua T, Qi G, Jain R (2011) Image annotation by kNN-sparse graph-based label propagation over noisily tagged web images. ACM Trans Intell Syst Technol 2(2):14
Vijayanarasimhan S, Grauman K (2008) Keywords to visual categories: multiple-instance learning for weakly supervised object categorization. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8
Viola P, Platt J, Zhang C (2006) Multiple instance boosting for object detection. Adv Neural Inf Proc Syst 18:1417
Wang D, Li J, Zhang B (2006) Multiple-instance learning via random walk. In: Machine learning: ECML 2006, pp 473–484
Wang J, Zucker J (2000) Solving the multiple-instance problem: a lazy learning approach. In: Proc. 17th international conf. on machine learning, pp 1119–1125
Yang K, Hua X, Wang M, Zhang H (2011) Tag tagging: towards more descriptive keywords of image content. IEEE Trans Multimedia 13(4):662–673
Zha Z, Hua X, Mei T, Wang J, Qi G, Wang Z (2008) Joint multi-label multi-instance learning for image classification. In: IEEE conference Computer Vision and Pattern Recognition (CVPR), pp 1–8
Zhang M, Zhou Z (2009) Multi-instance clustering with applications to multi-instance prediction. Appl Intell 31(1):47–68
Zhang Q, Goldman S (2001) EM-DD: an improved multiple-instance learning technique. Adv Neural Inf Proc Syst 14:1073–1080
Acknowledgements
The authors would like to thank Jonathan Fortune for polishing the language. This work is partly supported by the doctorate foundation of Northwestern Polytechnical University (No. CX201113), the Doctoral Program of Higher Education of China (Grant Nos. 20106102110028 and 20116102110027) and the National Science Foundation of China (Grant Nos. 61075014 and 61272285).
Appendix: Parts 2 and 3 of the experiments on COREL30K
Cite this article
Xia, Z., Shen, Y., Feng, X. et al. Automatic tag-to-region assignment via multiple instance learning. Multimed Tools Appl 74, 979–1002 (2015). https://doi.org/10.1007/s11042-013-1707-2
DOI: https://doi.org/10.1007/s11042-013-1707-2