research-article

Data-driven visual similarity for cross-domain image matching

Authors:
Abhinav Shrivastava (Carnegie Mellon University)
Tomasz Malisiewicz (MIT)
Abhinav Gupta (Carnegie Mellon University)
Alexei A. Efros (Carnegie Mellon University)

ACM Transactions on Graphics, Volume 30, Issue 6, pp 1–10
https://doi.org/10.1145/2070781.2024188

Published: 12 December 2011

Abstract

The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks, e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.
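The abstract does not spell out the mechanics, so the following is only a minimal sketch of one way to read "data-driven uniqueness": train a linear classifier with the query as the single positive example and a pool of unrelated images as negatives, then use the learned weights to score candidates. The hinge-loss classifier, the helper names (`learn_uniqueness_weights`, `score`), the hyperparameters, and the toy 4-D feature vectors are all assumptions made for illustration, not details taken from the paper.

```python
# Minimal sketch (assumptions labeled): learn per-feature weights for ONE query
# by training a linear hinge-loss classifier with the query as the only
# positive and a pool of unrelated images as negatives.

def dot(a, b):
    """Plain dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def learn_uniqueness_weights(query, negatives, epochs=2000, lr=0.01, reg=0.01):
    """Batch subgradient descent on an L2-regularised hinge loss.

    Hypothetical helper, not from the paper. The single positive is
    up-weighted by the number of negatives so it is not swamped by them.
    Returns (w, b): one weight per feature dimension and a bias.
    """
    dim = len(query)
    w, b = [0.0] * dim, 0.0
    samples = [(query, 1.0, float(len(negatives)))]
    samples += [(x, -1.0, 1.0) for x in negatives]
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y, cost in samples:
            if y * (dot(w, x) + b) < 1.0:  # margin violated: hinge is active
                for i in range(dim):
                    grad_w[i] -= cost * y * x[i]
                grad_b -= cost * y
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def score(w, b, candidate):
    """Similarity of a candidate to the query under the learned weights."""
    return dot(w, candidate) + b


# Toy features (made-up numbers): dimensions 0 and 1 fire on almost every
# image, while dimension 2 is rare and present in the query.
negatives = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.8, 0.9, 0.1, 0.1],
]
query = [1.0, 1.0, 1.0, 0.0]
match = [0.2, 0.3, 0.9, 0.0]      # shares the rare feature, differs elsewhere
lookalike = [1.0, 1.0, 0.0, 0.0]  # shares only the common features

w, b = learn_uniqueness_weights(query, negatives)
print("raw similarity:", dot(query, match), dot(query, lookalike))
print("learned score: ", score(w, b, match), score(w, b, lookalike))
```

In this toy setup the unweighted dot product favours `lookalike`, because it shares the features that nearly every image has, whereas the learned weights favour `match`, which shares the feature that sets the query apart from the negative pool. Training one classifier per query is also consistent with the abstract's remark that the technique is currently too expensive for interactive retrieval.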

Index Terms

Data-driven visual similarity for cross-domain image matching
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
      2. Computer vision tasks
        Scene understanding

Published in

ACM Transactions on Graphics Volume 30, Issue 6
December 2011
678 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2070781

Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States

Author Tags
image matching
image retrieval
paintings
re-photography
saliency
sketches
visual memex
visual similarity

Article Metrics
- Total Citations: 130
- Total Downloads: 2,771
- Downloads (Last 12 months): 4
- Downloads (Last 6 weeks): 0