Abstract
The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks, e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.
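The idea of weighting query features by "data-driven uniqueness" can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the discriminative learner, so the sketch swaps in a closed-form, regularised LDA-style linear discriminant as a stand-in, and the function names (`uniqueness_weights`, `rank_candidates`), the 4-dimensional toy features, and the `reg` parameter are all assumptions made for illustration. The query is treated as the single positive example and a pool of unrelated images as negatives, so dimensions where the query deviates from the pool relative to the pool's spread receive large weights.

```python
import numpy as np


def uniqueness_weights(query, negatives, reg=1e-3):
    """Hypothetical helper: weight each feature dimension by how strongly the
    query stands out from a pool of negative examples.

    Computes w = (Sigma + reg * I)^-1 (query - mu), where mu and Sigma are the
    mean and covariance of the negatives. This is an LDA-style stand-in for a
    discriminative learner, not the method described in the paper.
    """
    mu = negatives.mean(axis=0)
    centered = negatives - mu
    cov = centered.T @ centered / len(negatives)
    cov += reg * np.eye(cov.shape[0])  # regularise so the solve is well-posed
    return np.linalg.solve(cov, query - mu)


def rank_candidates(weights, candidates):
    """Score candidates with the learned weights; return indices, best first."""
    scores = candidates @ weights
    return np.argsort(-scores), scores


# Toy data (assumed): dimensions 0-2 vary a lot across the negative pool
# ("common" structure), dimension 3 barely varies, so a query that is active
# there is unusual.
rng = np.random.default_rng(0)
negatives = rng.normal(size=(500, 4)) * np.array([5.0, 5.0, 5.0, 0.1])
query = np.array([3.0, 3.0, 3.0, 1.0])

candidates = np.array([
    [3.0, 3.0, 3.0, 0.0],  # shares only the common dimensions with the query
    [0.0, 0.0, 0.0, 1.0],  # shares only the rare dimension with the query
])

w = uniqueness_weights(query, negatives)
order, scores = rank_candidates(w, candidates)

raw_best = int(np.argmax(candidates @ query))  # unweighted dot-product match
print("raw best:", raw_best, "weighted best:", int(order[0]))
```

In this toy setting the unweighted dot product favours the candidate that shares the common dimensions, while the weighted score favours the candidate that shares the rare one, which is the behaviour the uniqueness weighting is meant to produce.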
Data-driven visual similarity for cross-domain image matching
SA '11: Proceedings of the 2011 SIGGRAPH Asia Conference