Scalable search-based image annotation

  • Regular Paper
  • Multimedia Systems

Abstract

With the popularity of digital cameras, more and more people have accumulated large collections of digital images on their personal devices. As a result, there is a growing need to search these personal images effectively. Automatic image annotation can serve this goal, because the annotated keywords facilitate the search process. Although many image annotation methods have been proposed in recent years, their effectiveness on arbitrary personal images is constrained by their limited scalability, i.e., the limited lexicon of a small-scale training set. To achieve scalability, we propose a search-based image annotation algorithm that is analogous to information retrieval. First, content-based image retrieval technology is used to retrieve a set of visually similar images from a large-scale Web image set. Second, a text-based keyword search technique is used to obtain a ranked list of candidate annotations for each retrieved image. Third, a fusion algorithm is used to combine the ranked lists into a final candidate annotation list. Finally, the candidate annotations are re-ranked using Random Walk with Restarts, and only the top ones are retained as the final annotations. The use of both efficient search techniques and a Web-scale image set guarantees the scalability of the proposed algorithm. Moreover, we provide an annotation rejection scheme to flag the images that our annotation system cannot handle well. Experimental results on the U. Washington dataset show not only the effectiveness and efficiency of the proposed algorithm but also the advantage of image retrieval using annotation results over retrieval using visual features.
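The last two steps of the pipeline described above (fusing the per-image ranked lists, then re-ranking with Random Walk with Restarts) can be illustrated with a minimal Python sketch. The abstract does not specify the fusion rule, the word-similarity measure, or the restart probability, so everything concrete here is an assumption: the function names, the weighted reciprocal-rank fusion, the toy similarity values, and `c=0.3` are illustrative choices, not the authors' method.

```python
from collections import defaultdict


def fuse_ranked_lists(ranked_lists, weights):
    """Fuse per-image keyword rankings into one candidate score map.

    Assumption: weighted reciprocal-rank voting, where each weight stands in
    for the visual similarity of a retrieved image to the query image.
    """
    scores = defaultdict(float)
    for keywords, weight in zip(ranked_lists, weights):
        for rank, keyword in enumerate(keywords, start=1):
            scores[keyword] += weight / rank
    return dict(scores)


def random_walk_with_restarts(similarity, restart, c=0.3, iters=200, tol=1e-10):
    """Iterate r = (1 - c) * P r + c * v until the scores stop changing.

    P is the column-normalized word-similarity matrix, v is the normalized
    restart vector (here the fused scores), and c is the restart probability.
    """
    words = sorted(restart)
    total = sum(restart.values())
    v = {w: restart[w] / total for w in words}

    def sim(a, b):
        # Symmetric lookup; unlisted pairs have similarity 0.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    col_sum = {j: sum(sim(i, j) for i in words) for j in words}
    r = dict(v)
    for _ in range(iters):
        new_r = {}
        for i in words:
            walk = sum(sim(i, j) / col_sum[j] * r[j]
                       for j in words if col_sum[j] > 0)
            new_r[i] = (1 - c) * walk + c * v[i]
        delta = sum(abs(new_r[w] - r[w]) for w in words)
        r = new_r
        if delta < tol:
            break
    return r


# Toy input: keyword lists for three retrieved images (hypothetical data).
ranked_lists = [["sky", "tree", "water"], ["tree", "grass"], ["sky", "building"]]
visual_weights = [0.9, 0.7, 0.5]
word_similarity = {
    ("sky", "tree"): 0.6, ("sky", "water"): 0.5, ("tree", "grass"): 0.8,
    ("sky", "building"): 0.4, ("water", "tree"): 0.3,
}

candidates = fuse_ranked_lists(ranked_lists, visual_weights)
reranked = random_walk_with_restarts(word_similarity, candidates)
top_annotations = sorted(reranked, key=reranked.get, reverse=True)[:3]
print(top_annotations)
```

Using the fused scores as the restart vector keeps the re-ranked result anchored to the search evidence, while the walk term lets mutually related keywords reinforce each other. Keeping only the first few entries of the sorted result corresponds to retaining the top annotations.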

Author information

Corresponding author

Correspondence to Changhu Wang.

Additional information

Communicated by E. Chang.

About this article

Cite this article

Wang, C., Jing, F., Zhang, L. et al. Scalable search-based image annotation. Multimedia Systems 14, 205–220 (2008). https://doi.org/10.1007/s00530-008-0128-y

  • DOI: https://doi.org/10.1007/s00530-008-0128-y
