
Data-driven visual similarity for cross-domain image matching

Published: 12 December 2011

Abstract

The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks, e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.
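
The abstract does not spell out the procedure. One plausible reading of "data-driven uniqueness" with standard discriminative tools is to train a linear classifier with the query as the only positive example and a pool of unrelated images as negatives, then rank database images by the learned weights. The toy sketch below illustrates that reading only. The function name `learn_query_weights`, the synthetic 20-dimensional features, the hyperparameters, and the plain subgradient hinge-loss solver are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def learn_query_weights(query, negatives, pos_weight=10.0, reg=0.01,
                        lr=0.001, epochs=3000):
    """Fit a linear hinge-loss classifier with `query` as the only positive.

    Dimensions that also fire on the negatives get pushed down, so the
    remaining weight sits on what is unusual about the query.
    (Hypothetical helper; all defaults are illustrative.)
    """
    w = np.zeros(query.shape[0])
    b = 0.0
    n = negatives.shape[0]
    for _ in range(epochs):
        grad_w = reg * w
        grad_b = 0.0
        # Positive term: the single query should score at least +1.
        if query @ w + b < 1.0:
            grad_w = grad_w - pos_weight * query
            grad_b -= pos_weight
        # Negative term: every negative should score at most -1.
        violating = negatives @ w + b > -1.0
        grad_w = grad_w + negatives[violating].sum(axis=0) / n
        grad_b += violating.sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic stand-in for image descriptors (an assumption, not real data):
# "common" dimensions fire on every image, "unique" ones only on the scene.
rng = np.random.default_rng(0)
n_common, n_unique = 16, 4
negatives = np.hstack([
    rng.normal(1.0, 0.1, size=(200, n_common)),
    rng.normal(0.0, 0.1, size=(200, n_unique)),
])
query = np.ones(n_common + n_unique)
# Same-domain image that shares only the common structure.
distractor = np.concatenate([np.ones(n_common), np.zeros(n_unique)])
# Cross-domain image of the same scene: weaker common response,
# but the unique structure is present.
match = np.concatenate([np.full(n_common, 0.4), np.ones(n_unique)])
database = np.stack([distractor, match])

w, b = learn_query_weights(query, negatives)
raw_best = int(np.argmin(np.linalg.norm(database - query, axis=1)))
learned_best = int(np.argmax(database @ w))
print("nearest by raw distance:", raw_best)
print("top score with learned weights:", learned_best)
```

In this toy setup the raw Euclidean distance favors the distractor, because most dimensions are shared by every image, while the learned weights concentrate on the dimensions that the negatives lack. Retraining a classifier for every query is also consistent with the abstract's remark that the technique is currently too computationally intensive for interactive retrieval.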

