Deep cross-view autoencoder network for multi-view learning

Abstract

In many real-world applications, objects are increasingly captured from varying viewpoints or by different sensors, creating an urgent demand for recognizing objects across distinct heterogeneous views. Although significant progress has been made recently, heterogeneous recognition (cross-view recognition) in multi-view learning remains challenging because of the complex correlations among views. Multi-view subspace learning is an effective solution: it attempts to obtain a common representation that can be used for downstream computations. Most previous methods first extract features and then maximize the correlation between them to establish the relationship among views, and this two-step manner leads to performance deterioration. To overcome this drawback, in this paper we propose a deep cross-view autoencoder network (DCVAE) that extracts the features of different views and establishes the correlations between views in a single step, handling view-specific information, cross-view correlation, and consistency jointly. Specifically, DCVAE contains self-reconstruction, a newly designed cross-view reconstruction, and consistency constraint modules. Self-reconstruction preserves view-specific information, cross-view reconstruction transfers information from one view to another, and the consistency constraint makes the representations of different views more consistent. The proposed model is able to discover the complex correlations embedded in multi-view data and to integrate heterogeneous views into a latent common representation subspace. Furthermore, 2D embeddings of the learned common representation subspace show that the consistency constraint is effective, and cross-view classification experiments verify the superior performance of DCVAE in the two-view scenario.
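
To make the three modules described above more concrete, the sketch below shows how such a joint objective can be written in PyTorch for the two-view case: each view has its own encoder/decoder pair, and a single loss combines self-reconstruction, cross-view reconstruction, and a consistency term on the latent codes. This is only an illustrative sketch consistent with the summary above, not the authors' implementation; the network sizes, the latent dimension, the loss weights alpha and beta, and the use of mean-squared error for every term are assumptions made for the example.

import torch
import torch.nn as nn

class ViewAutoencoder(nn.Module):
    """Encoder/decoder pair for a single view."""
    def __init__(self, in_dim, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

def dcvae_style_loss(net1, net2, x1, x2, alpha=1.0, beta=1.0):
    """Joint loss: self-reconstruction + cross-view reconstruction + consistency.
    alpha and beta are illustrative hyperparameters, not values from the paper."""
    mse = nn.MSELoss()
    z1, z2 = net1.encoder(x1), net2.encoder(x2)
    # Self-reconstruction: each view is rebuilt from its own latent code,
    # preserving view-specific information.
    l_self = mse(net1.decoder(z1), x1) + mse(net2.decoder(z2), x2)
    # Cross-view reconstruction: the latent code of one view is decoded into
    # the other view, transferring information across views.
    l_cross = mse(net2.decoder(z1), x2) + mse(net1.decoder(z2), x1)
    # Consistency constraint: latent codes of paired samples are pulled
    # together so the common representation becomes view-consistent.
    l_cons = mse(z1, z2)
    return l_self + alpha * l_cross + beta * l_cons

if __name__ == "__main__":
    # Toy usage with random paired data standing in for two heterogeneous views.
    net1, net2 = ViewAutoencoder(in_dim=100), ViewAutoencoder(in_dim=80)
    params = list(net1.parameters()) + list(net2.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    x1, x2 = torch.randn(32, 100), torch.randn(32, 80)
    loss = dcvae_style_loss(net1, net2, x1, x2)
    opt.zero_grad()
    loss.backward()
    opt.step()

In a sketch of this kind, the relative weights on the cross-view and consistency terms control the trade-off between preserving view-specific detail and enforcing a consistent common representation across views.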


Acknowledgements

This work was sponsored by the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202100638) and the Natural Science Foundation of Chongqing (Grant No. cstc2018jcyjAX0532).

Author information

Corresponding author

Correspondence to Jian-Xun Mi.

About this article

Cite this article

Mi, JX., Fu, CQ., Chen, T. et al. Deep cross-view autoencoder network for multi-view learning. Multimed Tools Appl 81, 24645–24664 (2022). https://doi.org/10.1007/s11042-022-12636-2
