Abstract
We examine an under-explored visual recognition problem in which the training data contain a main view along with an auxiliary view of visual information, but only the main view is available in the test data. To effectively leverage the auxiliary view to train a stronger classifier, we propose a collaborative auxiliary learning framework based on a new discriminative canonical correlation analysis. This framework reveals a common semantic space shared across both views by enforcing a series of nonlinear projections. Such projections automatically embed the discriminative cues hidden in both views into the common space, and better visual recognition is thus achieved on test data that come only from the main view. The efficacy of our proposed auxiliary learning approach is demonstrated through three challenging visual recognition tasks with different kinds of auxiliary information.
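The train-with-two-views, test-with-one setting described above can be illustrated with a minimal sketch. It uses plain linear canonical correlation analysis, not the discriminative nonlinear variant proposed in the paper, and the synthetic data, the function names `fit_cca` and `project_main`, and the `reg` parameter are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(X, Z, k, reg=1e-3):
    """Plain linear CCA (illustrative stand-in, not the paper's method).
    X: (n, dx) main view, Z: (n, dz) auxiliary view, k: shared dimensions."""
    mx, mz = X.mean(axis=0), Z.mean(axis=0)
    Xc, Zc = X - mx, Z - mz
    n = X.shape[0]
    # Regularized covariances keep the inverse square roots well defined.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Czz = Zc.T @ Zc / n + reg * np.eye(Z.shape[1])
    Cxz = Xc.T @ Zc / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Kz = inv_sqrt(Cxx), inv_sqrt(Czz)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxz @ Kz)
    Wx = Kx @ U[:, :k]
    Wz = Kz @ Vt.T[:, :k]
    return mx, Wx, mz, Wz, s[:k]

def project_main(X, mx, Wx):
    """Map main-view features into the shared space; no auxiliary view needed."""
    return (X - mx) @ Wx

# Hypothetical synthetic data: both views are noisy functions of a shared latent.
rng = np.random.default_rng(0)
n, m, dx, dz, k = 500, 50, 5, 4, 2
A, B = rng.normal(size=(k, dx)), rng.normal(size=(k, dz))
T_train = rng.normal(size=(n, k))
X_train = T_train @ A + 0.1 * rng.normal(size=(n, dx))
Z_train = T_train @ B + 0.1 * rng.normal(size=(n, dz))

# Training uses both views.
mx, Wx, mz, Wz, corrs = fit_cca(X_train, Z_train, k)

# Test time: only the main view exists, so only Wx is applied.
X_test = rng.normal(size=(m, k)) @ A + 0.1 * rng.normal(size=(m, dx))
F_test = project_main(X_test, mx, Wx)
```

In this sketch the auxiliary view influences `Wx` during fitting but is never touched at test time; a classifier such as an SVM could then be trained on the projected main-view features.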
Acknowledgement
Research reported in this publication was partly supported by the National Institute of Nursing Research of the National Institutes of Health under Award Number R01NR015371. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work is also partly supported by US National Science Foundation Grant IIS 1350763, China National Natural Science Foundation Grant 61228303, GH’s start-up funds from Stevens Institute of Technology, a Google Research Faculty Award, a gift grant from Microsoft Research, and a gift grant from NEC Labs America.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, Q., Hua, G., Liu, W., Liu, Z., Zhang, Z. (2015). Can Visual Recognition Benefit from Auxiliary Information in Training?. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_5
DOI: https://doi.org/10.1007/978-3-319-16865-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science (R0)