Abstract
Canonical correlation analysis (CCA) and its nonlinear extensions have shown promising performance in multi-view representation learning. One of the most representative methods is Deep Canonical Correlation Analysis Autoencoders (DCCAE), which combines CCA with a reconstruction loss to preserve more information in the learned representations. However, the contradiction between reconstruction and correlation maximization hinders their joint optimization. Here we propose a multi-view representation learning method named Full Reconstruction based Deep Canonical Correlation Analysis (FR-DCCA), which not only resolves this contradiction but also enables reconstruction and correlation maximization to benefit from each other. In FR-DCCA, a Split Encoder models the information in each view as shared information and specific information; a CCA layer maintains consistency by maximizing the canonical correlation between the shared information of the views; and a Full Reconstruction module guarantees completeness and complementarity by reconstructing each view from both its shared and specific information. As a result, reconstruction and correlation maximization mutually reinforce each other and yield complete, compact, and discriminative view representations. Experiments and analysis on multiple datasets demonstrate that: 1) FR-DCCA significantly outperforms the comparison methods; and 2) FR-DCCA effectively resolves the contradiction between reconstruction and correlation maximization. To facilitate future research, we release the code at https://github.com/FR-DCCA/FR-DCCA.
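The two objectives named in the abstract, maximizing canonical correlation between the shared representations and fully reconstructing each view from its shared plus specific parts, can be sketched as below. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the function names, the `eps` regularizer, and the trade-off weight `lam` are all hypothetical.

```python
import numpy as np

def cca_correlation(H1, H2, eps=1e-6):
    """Sum of canonical correlations between two batches of shared
    representations H1, H2, each of shape (N, d)."""
    H1 = H1 - H1.mean(axis=0)          # center each view
    H2 = H2 - H2.mean(axis=0)
    N = H1.shape[0]
    # regularized sample covariances (eps keeps them invertible)
    S11 = H1.T @ H1 / (N - 1) + eps * np.eye(H1.shape[1])
    S22 = H2.T @ H2 / (N - 1) + eps * np.eye(H2.shape[1])
    S12 = H1.T @ H2 / (N - 1)

    def inv_sqrt(S):                   # S^{-1/2} via eigendecomposition (S is SPD)
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    T = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
    # the singular values of T are the canonical correlations
    return np.linalg.svd(T, compute_uv=False).sum()

def fr_dcca_loss(x1, x2, shared1, shared2, recon1, recon2, lam=1.0):
    """Hypothetical combined objective: negative canonical correlation of
    the shared parts, plus the full-reconstruction error of each view,
    where recon1/recon2 are decoder outputs computed from the
    concatenation of that view's shared and specific representations."""
    rec = np.mean((x1 - recon1) ** 2) + np.mean((x2 - recon2) ** 2)
    return -cca_correlation(shared1, shared2) + lam * rec
```

Because the correlation term only touches the shared representations while the reconstruction term also uses the specific ones, the two terms act on different parts of the code and need not compete, which is the intuition behind the split described above.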
Acknowledgement
This work was supported by the NSFC Project under Grants No. 62176069 and 61933013, by the Innovation Group of Guangdong Education Department under Grant No. 2020KCXTD014, and by the 2019 Key Discipline Project of Guangdong Province.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sun, Q., Jia, X., Jing, X.Y. (2022). Addressing Contradiction Between Reconstruction and Correlation Maximization in Deep Canonical Correlation Autoencoders. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15918-3
Online ISBN: 978-3-031-15919-0
eBook Packages: Computer Science, Computer Science (R0)