A diversity-based search approach to support annotation of a large fish image dataset

  • Special Issue Paper
Multimedia Systems

Abstract

Label propagation consists of annotating an unlabeled dataset starting from a set of labeled items. However, most current methods rely only on the similarity between labeled and unlabeled images to find propagation candidates, which, especially in very large datasets, may result in retrieving mostly near-duplicate images. Such approaches are technically correct, as they maximize propagation precision, but the resulting annotated dataset may be less useful, since it lacks intra-class variability within the set of images sharing the same label. In this paper, we propose an approach for label propagation that favors propagating an object’s label to a set of images representing as many different views of that object as possible, while preserving the relevance of the retrieved items to the query. Our method combines a diversity-based clustering technique built on a random forest framework with a label propagation step that propagates annotations effectively and efficiently using similarity computed on clusters. The method was tested on a very large dataset of fish images, achieving good performance in automated label propagation and ensuring diversification of the annotated items while preserving precision.
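The trade-off described in the abstract, relevance to the query versus diversity among the items that receive the label, can be illustrated with a minimal sketch. It is not the method of the paper: the random forest based clustering is replaced here by a plain greedy max-min (farthest-point) selection, and the function name `diverse_candidates`, the `relevance_radius` parameter, and the toy 2-D feature vectors are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def diverse_candidates(query, items, k, relevance_radius):
    """Pick up to k indices of items that are relevant AND mutually diverse.

    Hypothetical stand-in for diversity-based clustering:
    1. keep only items within relevance_radius of the query (relevance);
    2. start from the item closest to the query;
    3. repeatedly add the relevant item farthest from everything
       already selected (greedy max-min diversification).
    """
    relevant = [i for i, v in enumerate(items)
                if euclidean(query, v) <= relevance_radius]
    if not relevant:
        return []
    selected = [min(relevant, key=lambda i: euclidean(query, items[i]))]
    while len(selected) < min(k, len(relevant)):
        remaining = [i for i in relevant if i not in selected]
        nxt = max(
            remaining,
            key=lambda i: min(euclidean(items[i], items[j]) for j in selected),
        )
        selected.append(nxt)
    return selected


# Toy data: three near-duplicates, two distinct views, one irrelevant item.
items = [(0, 0.10), (0, 0.11), (0, 0.12), (1, 0), (0, 1), (5, 5)]
query = (0, 0)

picked = diverse_candidates(query, items, k=3, relevance_radius=1.5)
# picked == [0, 3, 4]; a plain 3-nearest-neighbor search would instead
# return the three near-duplicates [0, 1, 2].

# Propagate the query label only to the selected, diverse candidates.
labels = {i: "query_label" for i in picked}
```

A similarity-only search spends the whole annotation budget on near-duplicates, whereas the diversified selection covers the distinct views while still excluding the item outside the relevance radius.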

Notes

  1. http://www.xeno-canto.org/.

  2. http://www.plantnet-project.org/.

  3. http://www.fish4knowledge.eu.

Author information

Corresponding author

Correspondence to S. Palazzo.

About this article

Cite this article

Giordano, D., Palazzo, S. & Spampinato, C. A diversity-based search approach to support annotation of a large fish image dataset. Multimedia Systems 22, 725–736 (2016). https://doi.org/10.1007/s00530-015-0491-4

  • DOI: https://doi.org/10.1007/s00530-015-0491-4
