Accurate and efficient cross-domain visual matching leveraging multiple feature representations

Sun, Gang; Wang, Shuhui; Liu, Xuehui; Huang, Qingming; Chen, Yanyun; Wu, Enhua

doi:10.1007/s00371-013-0818-0

Original Article
Published: 25 April 2013

Volume 29, pages 565–575, (2013)
The Visual Computer

Gang Sun^1,2, Shuhui Wang^3, Xuehui Liu^1, Qingming Huang^2,3, Yanyun Chen^1 & Enhua Wu^1,4

Abstract

Cross-domain visual matching aims to find visually similar images across a wide range of visual domains and has had a practical impact on a number of applications. Unfortunately, the state-of-the-art approach, which estimates the relative importance of individual feature dimensions, still suffers from low matching accuracy and high time cost. To this end, this paper proposes a novel cross-domain visual matching framework that leverages multiple feature representations. To integrate the discriminative power of multiple features, we develop a data-driven, query-specific feature fusion model, which simultaneously estimates the relative importance of the individual feature dimensions and the weight vector among multiple features. Moreover, to alleviate the computational burden of an exhaustive subimage search, we design a speedup scheme that employs hyperplane hashing to rapidly collect hard negatives. Extensive experiments on various matching tasks demonstrate that the proposed approach outperforms the state of the art in both accuracy and efficiency.
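
The abstract does not give the details of the speedup scheme, so the following is only an illustrative sketch of the general idea behind hyperplane hashing, not the authors' implementation. It uses a two-bit hash family commonly described in the locality-sensitive hashing literature: a database point x is hashed with (sign(u·x), sign(v·x)) and a query hyperplane with normal w is hashed with (sign(u·w), sign(−v·w)), so points nearly perpendicular to w (that is, close to the hyperplane) are the most likely to share a bucket with it. All function names, parameters, and the assumption that the hyperplane passes through the origin are hypothetical choices made for this example.

```python
import random


def make_hash_family(dim, n_pairs, seed=0):
    """Draw n_pairs of Gaussian direction pairs (u, v); each pair yields two hash bits."""
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)],
         [rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(n_pairs)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hash_point(x, family):
    """Key for a database point x: bits (sign(u.x), sign(v.x)) for every pair."""
    bits = []
    for u, v in family:
        bits.append(dot(u, x) >= 0)
        bits.append(dot(v, x) >= 0)
    return tuple(bits)


def hash_hyperplane(w, family):
    """Key for a hyperplane with normal w: bits (sign(u.w), sign(-v.w)) for every pair."""
    bits = []
    for u, v in family:
        bits.append(dot(u, w) >= 0)
        bits.append(-dot(v, w) >= 0)
    return tuple(bits)


def build_tables(points, dim, n_tables, n_pairs, seed=0):
    """Index all points once in n_tables independent hash tables."""
    tables = []
    for t in range(n_tables):
        family = make_hash_family(dim, n_pairs, seed=seed + t)
        buckets = {}
        for i, x in enumerate(points):
            buckets.setdefault(hash_point(x, family), []).append(i)
        tables.append((family, buckets))
    return tables


def query(tables, w):
    """Return indices of points that share a bucket with hyperplane w in any table."""
    found = set()
    for family, buckets in tables:
        found.update(buckets.get(hash_hyperplane(w, family), []))
    return sorted(found)
```

Under this scheme, a single (u, v) pair collides with probability 1/4 for a point lying exactly on the hyperplane and with probability 0 for a point parallel to its normal, so `query(build_tables(points, dim, 20, 2), w)` returns a short candidate list biased toward points near the decision boundary. Those candidates would still need to be scored exactly before being treated as hard negatives; a hyperplane with a bias term can be handled by appending a constant 1 to each point and the bias to w.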

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This research is supported by the National Fundamental Research Grant 973 Program (2009CB320802), NSFC grants (61272326, 61025011), and a Research Grant of the University of Macau. Image credits: Andy Carvin, Bob Pejman, Leonid Afremov, Mariko Jesse, Risto-Jussi Isopahkala, Steven Allen, www.turbosquid.com.

Author information

Authors and Affiliations

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
Gang Sun, Xuehui Liu, Yanyun Chen & Enhua Wu
University of Chinese Academy of Sciences, Beijing, China
Gang Sun & Qingming Huang
Key Laboratory of Intelligent Information Processing (CAS), Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Shuhui Wang & Qingming Huang
University of Macau, Macao, China
Enhua Wu

Corresponding author

Correspondence to Gang Sun.

About this article

Cite this article

Sun, G., Wang, S., Liu, X. et al. Accurate and efficient cross-domain visual matching leveraging multiple feature representations. Vis Comput 29, 565–575 (2013). https://doi.org/10.1007/s00371-013-0818-0

Published: 25 April 2013
Issue Date: June 2013
DOI: https://doi.org/10.1007/s00371-013-0818-0
