Generative View-Correlation Adaptation for Semi-supervised Multi-view Learning

Liu, Yunyu; Wang, Lichen; Bai, Yue; Qin, Can; Ding, Zhengming; Fu, Yun

doi:10.1007/978-3-030-58568-6_19

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12359)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Multi-view learning (MVL) explores data extracted from multiple sources. It assumes that the complementary information between different views can be exploited to further improve learning performance. Two challenges arise. First, it is difficult to combine data from different views effectively while still fully preserving view-specific information. Second, multi-view datasets are usually small, so a model can easily overfit. To address these challenges, we propose a novel View-Correlation Adaptation (VCA) framework in a semi-supervised fashion. A semi-supervised data augmentation method is designed to generate extra features and labels based on both labeled and unlabeled samples. In addition, a cross-view adversarial training strategy is proposed to explore the structural information from one view and help the representation learning of the other view. Moreover, a simple and effective fusion network is proposed for the late fusion stage. In our model, all networks are jointly trained in an end-to-end fashion. Extensive experiments demonstrate that our approach is effective and stable compared with other state-of-the-art methods (code is available at https://github.com/wenwen0319/GVCA).
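
The abstract does not spell out how the augmentation step generates extra features and labels. As a rough illustration only, the sketch below uses mixup-style interpolation, a common technique for this kind of semi-supervised augmentation; it is not taken from the paper and may differ from the authors' method. The function names, the Beta parameter alpha, and the use of a soft pseudo-label for the unlabeled sample are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def mix(a: Sequence[float], b: Sequence[float], lam: float) -> Vector:
    """Element-wise convex combination: lam * a + (1 - lam) * b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def augment_pair(
    labeled_feat: Sequence[float],
    labeled_onehot: Sequence[float],
    unlabeled_feat: Sequence[float],
    unlabeled_soft: Sequence[float],
    alpha: float = 0.75,  # hypothetical Beta(alpha, alpha) shape parameter
    rng: Optional[random.Random] = None,
) -> Tuple[Vector, Vector]:
    """Create one synthetic (feature, label) pair from a labeled sample and
    an unlabeled sample that carries a predicted soft label (assumed to come
    from the current classifier)."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    new_feat = mix(labeled_feat, unlabeled_feat, lam)
    new_label = mix(labeled_onehot, unlabeled_soft, lam)
    return new_feat, new_label


if __name__ == "__main__":
    feat, label = augment_pair(
        labeled_feat=[1.0, 0.0],
        labeled_onehot=[1.0, 0.0, 0.0],
        unlabeled_feat=[0.0, 1.0],
        unlabeled_soft=[0.2, 0.5, 0.3],
        rng=random.Random(0),
    )
    print(feat, label)
```

Because the feature and the label are mixed with the same coefficient, the synthetic label stays a valid probability distribution whenever both input labels are.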

Acknowledgement

This research is supported by the U.S. Army Research Office Award W911NF-17-1-0367.

Author information

Authors and Affiliations

Northeastern University, Boston, MA, USA
Yunyu Liu, Lichen Wang, Yue Bai, Can Qin & Yun Fu
Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
Zhengming Ding

Corresponding author

Correspondence to Yunyu Liu.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Cite this paper

Liu, Y., Wang, L., Bai, Y., Qin, C., Ding, Z., Fu, Y. (2020). Generative View-Correlation Adaptation for Semi-supervised Multi-view Learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12359. Springer, Cham. https://doi.org/10.1007/978-3-030-58568-6_19

DOI: https://doi.org/10.1007/978-3-030-58568-6_19
Published: 13 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58567-9
Online ISBN: 978-3-030-58568-6
eBook Packages: Computer Science, Computer Science (R0)