Skip to main content
Log in

Accurate and efficient cross-domain visual matching leveraging multiple feature representations

  • Original Article
  • Published:
The Visual Computer Aims and scope Submit manuscript

Abstract

Cross-domain visual matching aims at finding visually similar images across a wide range of visual domains, and has shown a practical impact on a number of applications. Unfortunately, the state-of-the-art approach, which estimates the relative importance of the single feature dimensions still suffers from low matching accuracy and high time cost. To this end, this paper proposes a novel cross-domain visual matching framework leveraging multiple feature representations. To integrate the discriminative power of multiple features, we develop a data-driven, query specific feature fusion model, which estimates the relative importance of the individual feature dimensions as well as the weight vector among multiple features simultaneously. Moreover, to alleviate the computational burden of an exhaustive subimage search, we design a speedup scheme, which employs hyperplane hashing for rapidly collecting the hard-negatives. Extensive experiments carried out on various matching tasks demonstrate that the proposed approach outperforms the state-of-the-art in both accuracy and efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11

Similar content being viewed by others

References

  1. Bach, F.R., Lanckriet, G.R.G., Jordan, M.I.: Multiple kernel learning, conic duality, and the SMO algorithm. In: Proceedings of the 21th International Conference on Machine Learning (ICML), pp. 6–13 (2004)

    Google Scholar 

  2. Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4), 509–522 (2002)

    Article  Google Scholar 

  3. Cao, Y., Wang, C., Zhang, L., Zhang, L.: Edgel index for large-scale sketch-based image search. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 761–768 (2011)

    Google Scholar 

  4. Chen, T., Cheng, M.M., Tan, P., Shamir, A., Hu, S.M.: Sketch2Photo: Internet image montage. ACM Trans. Graph. 28(5), 124 (2009)

    Google Scholar 

  5. Chong, H.Y., Gortler, S.J., Zickler, T.: A perception-based color space for illumination-invariant image processing. ACM Trans. Graph. 27(3), 61 (2008) (SIGGRAPH)

    Article  Google Scholar 

  6. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 886–893 (2005)

    Google Scholar 

  7. Eitz, M., Hildebrand, K., Boubekeur, T., Alexa, M.: Sketch-based image retrieval: benchmark and bag-of-features descriptors. IEEE Trans. Vis. Comput. Graph. 17(11), 1624–1636 (2011)

    Article  Google Scholar 

  8. Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases (VLDB), pp. 518–529 (1999)

    Google Scholar 

  9. Ha, J.Y., Kim, G.Y., Choi, H.I.: The content-based image retrieval method using multiple features. In: International Conference on Networked Computing and Advanced Information Management (NCM), vol. 1, pp. 652–657 (2008)

    Google Scholar 

  10. Hauagge, D.C., Snavely, N.: Image matching using local symmetry features. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 206–213 (2012)

    Google Scholar 

  11. Hays, J., Efros, A.A.: Scene completion using millions of photographs. ACM Trans. Graph. 26(3), 4 (2007)

    Article  Google Scholar 

  12. Hays, J., Efros, A.A.: Im2gps: estimating geographic information from a single image. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2008)

    Google Scholar 

  13. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC), pp. 604–613 (1998)

    Google Scholar 

  14. Jain, P., Vijayanarasimhan, S., Grauman, K.: Hashing hyperplane queries to near points with applications to large-scale active learning. In: Advances in Neural Information Processing Systems (NIPS), vol. 23 (2010)

  15. Jegou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: European Conference on Computer Vision (ECCV), pp. 304–317 (2008)

    Google Scholar 

  16. Johnson, M.K., Dale, K., Avidan, S., Pfister, H., Freeman, W.T., Matusik, W.: CG2Real: improving the realism of computer generated images using a large collection of photographs. IEEE Trans. Vis. Comput. Graph. 17(9), 1273–1285 (2011)

    Article  Google Scholar 

  17. Lanckriet, G.R.G., Cristianini, N., Bartlett, P., Ghaoui, L.E., Jordan, M.I.: Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5, 27–72 (2004)

    MATH  Google Scholar 

  18. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 2169–2178 (2006)

    Google Scholar 

  19. Liu, W., Wang, J., Mu, Y., Kumar, S., Chang, S.F.: Compact hyperplane hashing with bilinear functions. In: Proceedings of the 29th International Conference on Machine Learning (ICML), pp. 17–24 (2012)

    Google Scholar 

  20. Lowe, D.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

    Article  Google Scholar 

  21. Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)

    Article  Google Scholar 

  22. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)

    Article  MATH  Google Scholar 

  23. Rakotomamonjy, A., Bach, F., Canu, S., Grandvalet, Y.: SimpleMKL. J. Mach. Learn. Res. 9, 2491–2521 (2008)

    MathSciNet  MATH  Google Scholar 

  24. Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circuits Syst. Video Technol. 8(5), 644–655 (1998)

    Article  Google Scholar 

  25. Russell, B.C., Sivic, J., Ponce, J., Dessales, H.: Automatic alignment of paintings and photographs depicting a 3D scene. In: 3rd International IEEE Workshop on 3D Representation for Recognition (3dRR) (2011)

    Google Scholar 

  26. Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2007)

    Google Scholar 

  27. Shrivastava, A., Malisiewicz, T., Gupta, A., Efros, A.A.: Data-driven visual similarity for cross-domain image matching. ACM Trans. Graph. 30(6), 154 (2011) (SIGGRAPH Asia)

    Article  Google Scholar 

  28. Vadivel, A., Sural, S., Majumdar, A.K.: Image retrieval from the web using multiple features. Online Inf. Rev. 33(6), 1169–1188 (2009)

    Article  Google Scholar 

  29. Vishwanathan, S.V.N., Sun, Z., Theera-Ampornpunt, N., Varma, M.: Multiple kernel learning and the SMO algorithm. In: Advances in Neural Information Processing Systems (NIPS), vol. 23 (2010)

    Google Scholar 

  30. Wang, S., Huang, Q., Jiang, S., Tian, Q.: S3MKL: scalable semi-supervised multiple kernel learning for real-world image applications. IEEE Trans. Multimedia 14(4), 1259–1274 (2012)

    Article  Google Scholar 

Download references

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This research is supported by National Fundamental Research Grant 973 Program (2009CB320802), NSFC grant (61272326, 61025011), and Research Grant of University of Macau. Image credits: Andy Carvin, Bob Pejman, Leonid Afremov, Mariko Jesse, Risto-Jussi Isopahkala, Steven Allen, www.turbosquid.com.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Gang Sun.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Sun, G., Wang, S., Liu, X. et al. Accurate and efficient cross-domain visual matching leveraging multiple feature representations. Vis Comput 29, 565–575 (2013). https://doi.org/10.1007/s00371-013-0818-0

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-013-0818-0

Keywords

Navigation