Knowledge-Based Systems

Volume 218, 22 April 2021, 106841

Consistency and diversity neural network multi-view multi-label learning

https://doi.org/10.1016/j.knosys.2021.106841

Highlights

  • An MVML feedforward neural network model is constructed, which differs from traditional neural network models.

  • The model jointly addresses view consistency, view diversity, and label correlations in MVML learning. In addition, an ensemble-like method is used to produce the final prediction.

  • Extensive empirical results of CDMM on benchmark data sets demonstrate its advantages over several related and competitive methods.

Abstract

In multi-view multi-label learning, each object is represented by multiple heterogeneous feature sets and is simultaneously associated with multiple class labels. Previous studies usually fuse multi-view representations through a shared subspace. However, as the number of views increases, it becomes more challenging to capture the high-order relationships among multiple views. Therefore, a novel neural network multi-view multi-label learning framework, CDMM, is proposed, which addresses the consistency and diversity among views with a simple and effective method. First, we build a separate classifier for each view based on a neural network with a nonlinear kernel mapping function, and require each view to learn a consistent label result. Then, we account for the diversity of individual views while learning a consistent representation among views; to this end, we employ the Hilbert–Schmidt Independence Criterion to explore the diversity among different views. Finally, a label correlation factor is added to the classification model, and a view contribution factor is added to the prediction model. Extensive comparative experiments with existing state-of-the-art methods on benchmark multi-view multi-label learning data sets demonstrate the effectiveness of the proposed method.

Introduction

Multi-label (ML) learning addresses the problem of label ambiguity by associating a single instance with a set of labels. For example, a picture can be labeled as "lake", "mountain", and "forest" at the same time, and there may be strong correlations among these labels. The main task of ML learning is to build a model that can effectively predict the label sets of unknown objects. Over the past few decades, scholars have proposed many effective ML learning algorithms in various domains, such as image annotation [1], [2], video annotation [3], and bioinformatics [4], [5].

Traditional ML learning acquires knowledge from a single data representation. However, in the real world, owing to the increasing diversity of data collection and feature extraction methods, a single example often has multiple views. Such data are usually associated with several heterogeneous feature representations simultaneously, each providing a different view of the data [6], [7], [8], [9]. For example, in image classification, a natural scene image can be represented by visual features or described by accompanying text. The main challenge of this type of task is how to effectively learn the heterogeneity among multiple views while accurately classifying the data. Therefore, to handle the more complex data classification problems of real scenes, the multi-view multi-label (MVML) learning framework has emerged.

Due to the widespread existence of MVML data sets, MVML learning has become an active research area in many practical applications [10], [11]. In MVML learning, each instance is represented by multiple heterogeneous feature sets and is associated with multiple class labels. Solutions based on either multi-view (MV) learning or ML learning each have fundamental problems to resolve. The main issues in methods that focus on MV learning are:

  • 1.

    Different views should share a consistent information representation, so effectively mining the consistency among views and fusing the correlations among high-dimensional heterogeneous data is of paramount importance.

  • 2.

    The information observed from each view differs, so the obtained information exhibits individual diversity. Mining this diversified information helps enhance the communication among views and thereby improves the performance of the algorithm.

  • 3.

    Structural differences of the data among views lead to different levels of importance for each view, so the contribution degree of each view also differs.

Based on these problems, we divide existing algorithms into two types. The first type is a two-step learning strategy: the first step directly applies an MV learning method to solve the MV learning problem, and the second step uses an existing ML learning method to solve the ML learning problem. For example, Liu et al. [12] proposed lrMMC, a multi-view framework based on matrix decomposition, which first seeks a shared representation of multiple views and then completes classification on the matrix of the shared feature space. Furthermore, [11] maps each view to a shared space to eliminate noise and redundancy while maintaining the sparse and manifold structures of the image data, respectively. This two-step learning strategy often yields sub-optimal results.

The second type is a joint learning strategy, which establishes a unified MVML learning model. For example, Zhao et al. [13] introduced a predictive reliability measure to select samples whose labels are shared with other views in a co-training manner; Luo et al. [14] jointly extract the consistency and specificity of heterogeneous features for subspace learning; Zhang et al. [15] proposed an MVML method based on matrix factorization, which exploits the complementarity among different views to obtain a common semantic representation. Such methods have high complexity, but the performance of the model is significantly improved.

The main problems that urgently need to be solved in methods that focus on ML learning are:

  • 1.

    In the face of high-dimensional heterogeneous data, how to effectively predict the label set of unknown instances;

  • 2.

    How to effectively fuse the information among views and mine label correlations to improve the performance of the classifier.

Based on this, some scholars have proposed solutions. One type of method concatenates all data views into a single view and then applies an ML learning method. However, this concatenation strategy has the following problems: it ignores the different physical interpretations of the view features, and it makes the feature dimension of the data too large, so the model tends to overfit during training. The other type establishes an ML classification model for each view of the heterogeneous data and unifies the results of these models to construct the final prediction. For example, Ren et al. [16] fuse multiple views into a mixed feature matrix and use a low-rank structure and manifold regularization to exploit global label correlation and local smoothness. Nevertheless, this parallel strategy forces all views to output consistent results, ignoring the diversity among views. Another common problem in MVML learning is that the degree of contribution differs among views.
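The two ML-centric strategies above can be sketched as follows; the array layout and the simple averaging combiner are illustrative assumptions, not the implementation used by any of the cited methods.

```python
# Sketch of the two ML-centric strategies described above (hypothetical data).
import numpy as np

def concatenate_views(views):
    """Strategy 1: stack all view features side by side.

    The result has dimension sum(d_v), which quickly grows large and,
    as noted in the text, encourages overfitting during training.
    """
    return np.hstack(views)

def average_view_scores(score_fns, views):
    """Strategy 2: one model per view, outputs combined in parallel.

    Averaging is one simple combiner; forcing a single consistent
    output ignores the diversity among views.
    """
    scores = [f(Xv) for f, Xv in zip(score_fns, views)]
    return np.mean(scores, axis=0)
```

Either sketch treats the views as a list of `(N, d_v)` matrices over the same N samples, which is the setting assumed throughout this paper.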

Based on the above analysis, current MVML learning mainly faces the following challenges: (1) the problems of consistency, diversity, and contribution degree among views; (2) the problem of label correlation in ML learning. Existing methods that adopt step-by-step strategies ignore the information exchange between MV and ML when solving MVML problems, so they often obtain sub-optimal results; methods that adopt a unified strategy suffer from excessive model complexity or insufficient consideration of these critical issues. To effectively address these problems, a consistency and diversity neural network multi-view multi-label learning method, named CDMM, is proposed. First, we design a random single-hidden-layer feedforward neural network (SLFN) to perform ML learning for each view and to ensure consistency across all views. Then, we use the Hilbert–Schmidt Independence Criterion (HSIC) [17], [18], [19] to induce diversity among different views and thereby learn the diverse information represented by each view. Finally, in classification, we combine the advantages of the proposed classifier to enrich the original label space with the idea of label-dependence propagation; in prediction, we make the final prediction according to the different contributions of each view to the MVML learning task.
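The diversity term above is the standard empirical HSIC; a minimal sketch is given below, assuming a linear kernel on the per-view outputs (the function name and kernel choice are illustrative, not the paper's code).

```python
# Empirical Hilbert–Schmidt Independence Criterion (HSIC), the quantity
# CDMM uses to measure (and, via regularization, control) the statistical
# dependence between the outputs of two views.
import numpy as np

def hsic(X, Y):
    """Empirical HSIC between two sample matrices (rows = samples).

    HSIC(X, Y) = trace(K H L H) / (n - 1)^2, where K and L are kernel
    matrices on X and Y, and H is the centering matrix.
    """
    n = X.shape[0]
    K = X @ X.T                            # linear kernel on view 1
    L = Y @ Y.T                            # linear kernel on view 2
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A larger HSIC value indicates stronger dependence between the two representations; penalizing it pushes the views toward diverse, complementary outputs.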

The main contributions of the paper are as follows:

  • 1.

    An MVML feedforward neural network model is constructed that differs from traditional neural network models. CDMM requires no iteration; it is efficient and straightforward, and its components are closely integrated.

  • 2.

    CDMM provides a unified framework that jointly studies the view consistency and diversity issues in MVML learning, while integrating label correlations and the different contribution factors of views into the classification and prediction models. In addition, an ensemble-like method is used to produce the final prediction.

  • 3.

    Extensive empirical results of CDMM on benchmark data sets demonstrate its advantages over several related and competitive methods.

The remainder of this paper is organized as follows. In Section 2, the related work of MVML learning is briefly introduced. Section 3 introduces the technical details of CDMM. The results of comparative experiments and specific analyses are illustrated in Section 4. Finally, Section 5 concludes this paper.

Multi-label learning

Unlike traditional single-label learning tasks, the goal of ML learning is to assign multiple class labels to a single instance, which has attracted the attention of many scholars across different machine learning tasks. According to the types of label correlations used, existing ML methods can usually be divided into three categories. First-order: each label is considered to have its own unique attributes, and correlations among labels are ignored,

Problem statement and notations

Let X = {X^v}_{v=1}^{V} denote a feature space data set with V views, where X^v = [x_1, …, x_N]^T ∈ R^{N×d_v} is the feature matrix of the vth view with N samples and d_v features. Y = [y_1, y_2, …, y_N]^T ∈ R^{N×m} denotes the corresponding label space, where y_i ∈ {−1, 1}^{1×m} is the label vector of x_i and m is the number of labels. Table 1 summarizes the definitions of the notations used in this paper.
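The notation can be instantiated as follows; all sizes here are hypothetical, chosen only to make the shapes concrete.

```python
# Illustrative instantiation of the MVML notation: V views X^v of shape
# (N, d_v) over the same N samples, and a label matrix Y of shape (N, m)
# with entries in {-1, +1}.
import numpy as np

V, N, m = 3, 100, 5
dims = [20, 35, 12]                              # d_v may differ per view
rng = np.random.default_rng(0)

X = [rng.standard_normal((N, d)) for d in dims]  # X^v in R^{N x d_v}
Y = rng.choice([-1, 1], size=(N, m))             # y_i in {-1, 1}^{1 x m}
```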

Label matrix reconstruction

Label correlation is a crucial factor to improve the performance of ML learning, so in this section, we embed the label

Datasets

To verify the effectiveness of the CDMM algorithm, we use a total of 7 benchmark MVML data sets for performance evaluation, which can be downloaded from MULAN1 and LEAR.2 The details of the data sets are summarized in Table 2.

Comparing algorithms

To verify the performance of CDMM, we selected three state-of-the-art MVML learning algorithms and one incomplete-view MVML weak-label learning algorithm

Conclusion

In this paper, we have studied how to mine the information of view-consistency and view-diversity in multi-view data to achieve effective multi-view multi-label classification. For this reason, a multi-view multi-label learning framework called CDMM is proposed. It uses a unified feedforward neural network model to find out the consistency and diversity among heterogeneous views and additionally considers label correlation factors and view contribution factors. The difference from previous

CRediT authorship contribution statement

Dawei Zhao: Conceptualization, Methodology, Software, Investigation, Data curation, Writing - original draft, Writing - review & editing. Qingwei Gao: Validation, Supervision, Project administration, Funding acquisition. Yixiang Lu: Visualization, Investigation. Dong Sun: Formal analysis, Validation. Yusheng Cheng: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research is supported by the Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education (Anhui University), China (2020A003), and the Natural Science Foundation of Anhui Province, China (2008085MF183).

References (48)

  • J. Zhang et al., Towards a unified multi-source-based optimization framework for multi-label learning, Appl. Soft Comput. (2019)
  • J. Lu et al., Structural property-aware multilayer network embedding for latent factor analysis, Pattern Recognit. (2018)
  • G. Huang et al., Trends in extreme learning machines: A review, Neural Netw. (2015)
  • Y.-H. Pao et al., Learning and generalization characteristics of the random vector functional-link net, Neurocomputing (1994)
  • Y. Cheng et al., Multi-label learning with kernel extreme learning machine autoencoder, Knowl.-Based Syst. (2019)
  • S. Wen et al., Multilabel image classification via feature/label co-projection, IEEE Trans. Syst. Man Cybern.: Syst. (2020)
  • F. Kang et al., Correlated label propagation with application to multi-label learning
  • A. Elisseeff et al., A kernel method for multi-labelled classification
  • M. Zhang et al., Multilabel neural networks with applications to functional genomics and text categorization, IEEE Trans. Knowl. Data Eng. (2006)
  • Z. Chen, X. Wu, Q. Chen, Y. Hu, M. Zhang, Multi-view partial multi-label learning with graph-based...
  • Y. Zhang et al., Multi-view multi-label learning with sparse feature selection for image annotation, IEEE Trans. Multimedia (2020)
  • X. Zhu et al., Block-row sparse multiview multilabel learning for image classification, IEEE Trans. Cybern. (2015)
  • M. Liu, Y. Luo, D. Tao, C. Xu, Y. Wen, Low-rank multi-view learning in matrix completion for...
  • S. Luo, C. Zhang, W. Zhang, X. Cao, Consistent and specific multi-view subspace clustering, in:...