Abstract
Label propagation consists of annotating an unlabeled dataset starting from a set of labeled items. However, most current methods exploit only the image similarity between labeled and unlabeled images to find propagation candidates, which may result, especially in very large datasets, in retrieving mostly near-duplicate images. While such approaches are technically correct, as they maximize propagation precision, the resulting annotated dataset may be less useful, since the images sharing the same label lack intra-class variability. In this paper, we propose an approach for label propagation that favors propagating an object’s label to a set of images representing as many different views of that object as possible, while preserving the relevance of the retrieved items to the query. Our method combines a diversity-based clustering technique built on a random forest framework with a similarity-based label propagation step that operates on clusters, which allows annotations to be propagated effectively and efficiently. The method was tested on a very large dataset of fish images, achieving good performance in automated label propagation and ensuring diversification of the annotated items while preserving precision.
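The idea of propagating over clusters rather than over raw nearest neighbors can be illustrated with a minimal sketch. It is not the method of the paper: the cluster assignments are assumed to be precomputed (in the paper they would come from the random-forest-based diversity clustering), and the Euclidean distance, the `max_dist` relevance threshold, the `per_cluster` cap, and the function name `propagate_diverse` are all hypothetical choices made for illustration only.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def propagate_diverse(query, items, cluster_of, max_dist, per_cluster=1):
    """Return the ids of unlabeled items that should receive the query's label.

    query       -- feature vector of the labeled image
    items       -- dict: item id -> feature vector (unlabeled images)
    cluster_of  -- dict: item id -> cluster id (assumed precomputed)
    max_dist    -- relevance threshold; farther items are never labeled
    per_cluster -- at most this many items are taken from each cluster
    """
    # Keep only relevant candidates, grouped by the cluster they belong to.
    by_cluster = defaultdict(list)
    for item_id, vec in items.items():
        d = euclidean(query, vec)
        if d <= max_dist:
            by_cluster[cluster_of[item_id]].append((d, item_id))

    # From each cluster keep only the closest candidates, so that
    # near-duplicates falling in the same cluster are not all selected.
    selected = []
    for members in by_cluster.values():
        members.sort()
        selected.extend(item_id for _, item_id in members[:per_cluster])
    return sorted(selected)


# Toy data: a1 and a2 are near-duplicates, b1 is a different view, c1 is unrelated.
items = {"a1": (0.10, 0.0), "a2": (0.11, 0.0), "b1": (0.0, 0.5), "c1": (5.0, 5.0)}
cluster_of = {"a1": 0, "a2": 0, "b1": 1, "c1": 2}
result = propagate_diverse((0.0, 0.0), items, cluster_of, max_dist=1.0)
print(result)
```

On this toy input a plain two-nearest-neighbor search would return the two near-duplicates `a1` and `a2`, whereas the cluster-aware selection keeps one of them and adds the more distant but still relevant `b1`.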
Cite this article
Giordano, D., Palazzo, S. & Spampinato, C. A diversity-based search approach to support annotation of a large fish image dataset. Multimedia Systems 22, 725–736 (2016). https://doi.org/10.1007/s00530-015-0491-4
DOI: https://doi.org/10.1007/s00530-015-0491-4