Abstract
Cross-domain visual matching aims at finding visually similar images across a wide range of visual domains, and has shown a practical impact on a number of applications. Unfortunately, the state-of-the-art approach, which estimates the relative importance of the single feature dimensions still suffers from low matching accuracy and high time cost. To this end, this paper proposes a novel cross-domain visual matching framework leveraging multiple feature representations. To integrate the discriminative power of multiple features, we develop a data-driven, query specific feature fusion model, which estimates the relative importance of the individual feature dimensions as well as the weight vector among multiple features simultaneously. Moreover, to alleviate the computational burden of an exhaustive subimage search, we design a speedup scheme, which employs hyperplane hashing for rapidly collecting the hard-negatives. Extensive experiments carried out on various matching tasks demonstrate that the proposed approach outperforms the state-of-the-art in both accuracy and efficiency.
Similar content being viewed by others
References
Bach, F.R., Lanckriet, G.R.G., Jordan, M.I.: Multiple kernel learning, conic duality, and the SMO algorithm. In: Proceedings of the 21th International Conference on Machine Learning (ICML), pp. 6–13 (2004)
Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4), 509–522 (2002)
Cao, Y., Wang, C., Zhang, L., Zhang, L.: Edgel index for large-scale sketch-based image search. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 761–768 (2011)
Chen, T., Cheng, M.M., Tan, P., Shamir, A., Hu, S.M.: Sketch2Photo: Internet image montage. ACM Trans. Graph. 28(5), 124 (2009)
Chong, H.Y., Gortler, S.J., Zickler, T.: A perception-based color space for illumination-invariant image processing. ACM Trans. Graph. 27(3), 61 (2008) (SIGGRAPH)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 886–893 (2005)
Eitz, M., Hildebrand, K., Boubekeur, T., Alexa, M.: Sketch-based image retrieval: benchmark and bag-of-features descriptors. IEEE Trans. Vis. Comput. Graph. 17(11), 1624–1636 (2011)
Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases (VLDB), pp. 518–529 (1999)
Ha, J.Y., Kim, G.Y., Choi, H.I.: The content-based image retrieval method using multiple features. In: International Conference on Networked Computing and Advanced Information Management (NCM), vol. 1, pp. 652–657 (2008)
Hauagge, D.C., Snavely, N.: Image matching using local symmetry features. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 206–213 (2012)
Hays, J., Efros, A.A.: Scene completion using millions of photographs. ACM Trans. Graph. 26(3), 4 (2007)
Hays, J., Efros, A.A.: Im2gps: estimating geographic information from a single image. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2008)
Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC), pp. 604–613 (1998)
Jain, P., Vijayanarasimhan, S., Grauman, K.: Hashing hyperplane queries to near points with applications to large-scale active learning. In: Advances in Neural Information Processing Systems (NIPS), vol. 23 (2010)
Jegou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: European Conference on Computer Vision (ECCV), pp. 304–317 (2008)
Johnson, M.K., Dale, K., Avidan, S., Pfister, H., Freeman, W.T., Matusik, W.: CG2Real: improving the realism of computer generated images using a large collection of photographs. IEEE Trans. Vis. Comput. Graph. 17(9), 1273–1285 (2011)
Lanckriet, G.R.G., Cristianini, N., Bartlett, P., Ghaoui, L.E., Jordan, M.I.: Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5, 27–72 (2004)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 2169–2178 (2006)
Liu, W., Wang, J., Mu, Y., Kumar, S., Chang, S.F.: Compact hyperplane hashing with bilinear functions. In: Proceedings of the 29th International Conference on Machine Learning (ICML), pp. 17–24 (2012)
Lowe, D.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
Rakotomamonjy, A., Bach, F., Canu, S., Grandvalet, Y.: SimpleMKL. J. Mach. Learn. Res. 9, 2491–2521 (2008)
Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circuits Syst. Video Technol. 8(5), 644–655 (1998)
Russell, B.C., Sivic, J., Ponce, J., Dessales, H.: Automatic alignment of paintings and photographs depicting a 3D scene. In: 3rd International IEEE Workshop on 3D Representation for Recognition (3dRR) (2011)
Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2007)
Shrivastava, A., Malisiewicz, T., Gupta, A., Efros, A.A.: Data-driven visual similarity for cross-domain image matching. ACM Trans. Graph. 30(6), 154 (2011) (SIGGRAPH Asia)
Vadivel, A., Sural, S., Majumdar, A.K.: Image retrieval from the web using multiple features. Online Inf. Rev. 33(6), 1169–1188 (2009)
Vishwanathan, S.V.N., Sun, Z., Theera-Ampornpunt, N., Varma, M.: Multiple kernel learning and the SMO algorithm. In: Advances in Neural Information Processing Systems (NIPS), vol. 23 (2010)
Wang, S., Huang, Q., Jiang, S., Tian, Q.: S3MKL: scalable semi-supervised multiple kernel learning for real-world image applications. IEEE Trans. Multimedia 14(4), 1259–1274 (2012)
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of the paper. This research is supported by National Fundamental Research Grant 973 Program (2009CB320802), NSFC grant (61272326, 61025011), and Research Grant of University of Macau. Image credits: Andy Carvin, Bob Pejman, Leonid Afremov, Mariko Jesse, Risto-Jussi Isopahkala, Steven Allen, www.turbosquid.com.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Sun, G., Wang, S., Liu, X. et al. Accurate and efficient cross-domain visual matching leveraging multiple feature representations. Vis Comput 29, 565–575 (2013). https://doi.org/10.1007/s00371-013-0818-0
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00371-013-0818-0