Abstract
This paper presents a unified annotation and retrieval framework that integrates region annotation with image retrieval so that each reinforces the other. To integrate semantic annotation with region-based image retrieval, visual and textual fusion is proposed for both soft matching and Bayesian probabilistic formulations. To address sample insufficiency and sample asymmetry in the annotation classifier training phase, we present a region-level multi-label image annotation scheme based on pair-wise coupling support vector machine (SVM) learning. In the retrieval phase, to achieve semantic-level region matching, we present a novel retrieval scheme that differs from former work: the query example uploaded by the user is automatically annotated online, and the user can judge its annotation quality. Based on the user’s judgment, one of two novel schemes is deployed for semantic retrieval: (1) if the user judges the photo to be well annotated, Semantically supervised Integrated Region Matching is adopted, which is a keyword-integrated soft region matching method; (2) if the user judges the photo to be poorly annotated, Keyword Integrated Bayesian Reasoning is adopted, which is a natural integration of a Visual Dictionary in online content-based search. In the relevance feedback phase, we conduct both visual and textual learning to capture the user’s retrieval target. On both the COREL 10,000 database and a Flickr web image database (25,000 images), the framework achieved better annotation and retrieval performance than current methods, which demonstrates its effectiveness.
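The abstract does not give the matching formula, so the following is only a minimal sketch of what a keyword-integrated soft region matching could look like. It assumes the greedy "most similar, highest priority" significance assignment of integrated region matching as used in the SIMPLIcity system. The additive `mismatch_penalty` for regions whose annotation keywords differ, the function name `irm_distance`, and the region tuple layout are all hypothetical and are not taken from the paper.

```python
from math import dist


def irm_distance(query, target, mismatch_penalty=1.0):
    """Soft region-matching distance between two segmented images.

    Each image is a list of regions (feature_vector, weight, keyword),
    where the weights of one image sum to 1 (e.g. area fractions).
    mismatch_penalty is a hypothetical additive cost for region pairs
    whose annotation keywords differ (assumption, not from the paper).
    """
    # Score every query/target region pair.
    pairs = []
    for i, (fq, _, kq) in enumerate(query):
        for j, (ft, _, kt) in enumerate(target):
            d = dist(fq, ft)  # Euclidean distance between visual features
            if kq != kt:
                d += mismatch_penalty
            pairs.append((d, i, j))

    # Greedy assignment: the most similar pairs claim significance first.
    pairs.sort()
    q_left = [w for _, w, _ in query]
    t_left = [w for _, w, _ in target]
    total = 0.0
    for d, i, j in pairs:
        s = min(q_left[i], t_left[j])  # significance credited to this pair
        if s > 0:
            total += s * d
            q_left[i] -= s
            t_left[j] -= s
    return total


# Two regions per image; visual features are identical in both targets,
# but the second target carries swapped keywords.
query = [((0.0, 0.0), 0.5, "sky"), ((1.0, 1.0), 0.5, "grass")]
swapped = [((0.0, 0.0), 0.5, "grass"), ((1.0, 1.0), 0.5, "sky")]
print(irm_distance(query, query))
print(irm_distance(query, swapped))
```

In this sketch, two images with identical visual regions are still pushed apart when their region keywords disagree, which is the intuition behind letting annotation supervise the soft matching.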
Additional information
Communicated by Mohan Kankanhalli, Ph.D.
About this article
Cite this article
Ji, R., Yao, H., Xu, P. et al. Visual and textual fusion for semantically supervised region-based retrieval. Multimedia Systems 15, 201–219 (2009). https://doi.org/10.1007/s00530-009-0154-4
DOI: https://doi.org/10.1007/s00530-009-0154-4