Abstract
An important issue in current collaborative frameworks for media tagging is that some images or videos may be annotated improperly or not annotated at all. To address this, this paper proposes a new knowledge propagation scheme that automatically propagates keywords from a subset of annotated images to the unannotated ones. The main idea is based on image content analysis and the training of keyword classifiers. An evolutionary scheme is used to find the salient regions in the annotated images, and the importance of the other regions is estimated using a one-class support vector machine (OCSVM). An ensemble of variable-length radial basis function (VLRBF)-based classifiers is trained on the visual features of the annotated images. The trained classifiers are then used for knowledge propagation. Experimental results using 100 concept categories demonstrate the effectiveness of the proposed method.
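The propagation step can be illustrated with a deliberately simplified sketch. It is not the VLRBF ensemble itself: it omits the evolutionary salient-region search and the OCSVM region weighting, and simply treats every annotated image's feature vector as one Gaussian RBF center for each of its keywords. The feature vectors, the `width` and `threshold` values, and all function names are assumptions made for illustration only.

```python
import math


def rbf(x, center, width):
    """Gaussian radial basis function of the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def train_keyword_models(annotated):
    """Collect, per keyword, the feature vectors of images tagged with it.

    Each keyword can end up with a different number of RBF centers, a loose
    stand-in for the variable-length classifiers named in the abstract.
    """
    models = {}
    for features, keywords in annotated:
        for kw in keywords:
            models.setdefault(kw, []).append(features)
    return models


def propagate(models, features, width=0.5, threshold=0.5):
    """Return keywords whose mean RBF response is at least the threshold."""
    scores = {
        kw: sum(rbf(features, c, width) for c in centers) / len(centers)
        for kw, centers in models.items()
    }
    tags = sorted(kw for kw, s in scores.items() if s >= threshold)
    return tags, scores


# Hypothetical two-dimensional visual features with user-supplied keywords.
annotated = [
    ((0.9, 0.1), {"sunset"}),
    ((0.8, 0.2), {"sunset", "beach"}),
    ((0.1, 0.9), {"forest"}),
]
models = train_keyword_models(annotated)

# An unannotated image close to the first two receives their keywords.
tags, scores = propagate(models, (0.85, 0.15))
print(tags)
```

In this toy setting the unannotated image inherits the keywords of its near neighbours in feature space and not the distant one. The method summarized in the abstract additionally weights image regions by estimated importance before the classifiers are trained.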
Cite this article
Yap, KH., Wu, K. & Zhu, C. Knowledge Propagation in Collaborative Tagging for Image Retrieval. J Sign Process Syst Sign Image Video Technol 59, 163–175 (2010). https://doi.org/10.1007/s11265-008-0288-1
DOI: https://doi.org/10.1007/s11265-008-0288-1