Elsevier

Future Generation Computer Systems

Volume 100, November 2019, Pages 523-530

Unsupervised multi-view non-negative for law data feature learning with dual graph-regularization in smart Internet of Things

https://doi.org/10.1016/j.future.2019.05.055

Highlights

  • Use dual graph regularization to model the data and feature manifolds of law data.

  • Reduce the effect of uncorrelated data by separating view-specific features.

  • Distinguish the influence of each view in the latent space.

  • Constrain the sparsity of the common subspace via the $\ell_{1,2}$-norm.

Abstract

In the real world, law data in the smart Internet of Things usually consists of heterogeneous information contaminated by noise. Non-negative matrix factorization (NMF) is a popular tool for multi-view learning and can be used to represent and learn heterogeneous law features comprehensively. However, current NMF-based techniques generally assume clean multi-view datasets when generating the common subspace, whereas in practice the data often contain noise or unrelated items that can severely degrade performance. In this paper, we propose a novel subspace learning model, called Adaptive Dual Graph-regularized Multi-View Non-Negative Feature Learning (ADMFL), for multi-view data representation. We use the geometric structures of both the data manifold and the feature manifold to model the distribution of data points in the common subspace. Meanwhile, we reduce the effect of unrelated features by separating out the view-specific features of each view. Moreover, we introduce a weight factor for each view and maintain the sparsity of the latent common representation. An effective objective function is thus designed and iteratively optimized until convergence. Experiments on standard datasets demonstrate that the proposed ADMFL method outperforms the methods it is compared against.

Introduction

In the big data era, a variety of law data is being collected in the smart Internet of Things. Such data is often heterogeneous, large in volume, and noisy [1], [2], [3], which makes it an urgent matter to acquire key information from these massive data and to establish correlations among them. Multi-view learning is one promising approach, in which different views describe the essential characteristics of the data along different dimensions. By learning the common subspace expressed by these essential features, it can acquire key information from the massive data or build a bridge among the views to filter out the effect of uncorrelated information. Recently, a variety of multi-view learning algorithms [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17] have been proposed and have demonstrated their effectiveness on law data.

Non-negative matrix factorization is an effective way to obtain a part-based common subspace in multi-view learning. It aims to integrate parts into a whole that provides a good approximation to the original data space. Following this idea, Multi-NMF [6] was developed to generate a common representation by formulating a joint matrix factorization. Researchers then derived variants [5], [8] that introduce further ideas to obtain better latent subspace features. Unfortunately, these methods have some drawbacks. One is that they cannot preserve the local geometric structure of the original data [18]. To tackle this, Cai et al. [18] proposed non-negative matrix factorization with graph regularization, which integrates manifold regularization with traditional NMF for single-view data learning. In [9], this method was extended to multi-view data. Furthermore, Cao et al. [14] used subspaces with diversity constraints to reinforce the complementarity between different views. Since different views usually have various manifestations, some researchers developed new methods [19], [20] that introduce two metric spaces to analyze the relationships among objects in the shared representation, reducing noise interference in single-view learning.
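
The graph-regularized NMF idea attributed to [18] above can be sketched as follows. This is a minimal single-view NumPy illustration, not ADMFL and not the authors' code. It assumes the commonly used objective ‖X − UVᵀ‖²_F + λ·tr(VᵀLV) with L = D − W and the corresponding multiplicative updates; the function names (`knn_affinity`, `gnmf_objective`, `gnmf`), the binary k-nearest-neighbour weights, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour affinity between the columns of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                      # a sample is not its own neighbour
    W = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, :k]
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                           # symmetrize


def gnmf_objective(X, U, V, W, lam):
    """Reconstruction error plus graph smoothness penalty on V (assumed form)."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T, "fro") ** 2 + lam * np.trace(V.T @ L @ V)


def gnmf(X, rank, W, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Multiplicative updates for X (M x N, non-negative) ~ U (M x rank) @ V.T."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    U = rng.random((M, rank))
    V = rng.random((N, rank))
    D = np.diag(W.sum(axis=1))
    history = []
    for _ in range(n_iter):
        # Ratios of non-negative terms keep U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        history.append(gnmf_objective(X, U, V, W, lam))
    return U, V, history
```

A call such as `U, V, history = gnmf(X, rank=4, W=knn_affinity(X, k=3))` returns the basis `U`, the sample codes `V`, and the objective value after each iteration. Setting `lam=0` drops the graph term and reduces the updates to plain NMF.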

However, in real-world applications, noise is unavoidable and affects the accuracy of data representation. Inspired by robust principal component analysis [21], some researchers [12], [22], [23] approximated the noisy information by introducing an error matrix into traditional NMF. Others employed feature selection techniques to locate related features, which can effectively reduce the noisy features in different views [10], [16], [17]. For example, Zhao et al. [10] removed the effect of unrelated items by separating the view-specific features from the shared feature representation, and obtained promising results.
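
The error-matrix idea mentioned above can be written, in one common form, as the objective below. This is a sketch of the general family rather than the exact formulation of [12], [22], or [23]; the symbols $U$, $V$, $E$, and $\lambda$ and the choice of an $\ell_1$ penalty on $E$ are assumptions introduced here for illustration.

```latex
\min_{U \ge 0,\; V \ge 0,\; E} \;
  \left\| X - U V^{\top} - E \right\|_F^2 + \lambda \left\| E \right\|_1
```

Here $U V^{\top}$ is the non-negative low-rank part and $E$ absorbs sparse corruption. A larger $\lambda$ pushes $E$ toward zero, and in the limit the problem reduces to standard NMF.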

Motivated by existing correlated feature learning and dual graph-regularized models, we design a novel NMF-based model, namely Adaptive Dual Graph-regularized Multi-View Non-Negative Feature Learning (ADMFL), in this paper. More specifically, we focus on dual graph regularization and parameter adjustment in multi-view subspace learning. Its functionality mainly rests on three pillars. Firstly, inspired by the recent progress in dual graph regularization, the geometric structures of both the data manifold and the feature manifold are embedded to learn a comprehensive representation. Secondly, to improve the view adaptability of ADMFL, the influence of the weight parameter for each view is considered and updated in a self-adaptive way. Finally, a new cost function is defined and optimized iteratively under non-negativity and sparsity constraints for each view via latent space sharing. The major contributions of this work are:

  • By combining dual graph regularization with view-specific features, the effect of uncorrelated items can be reduced in each view, while the distribution of objects can be modeled on both the data manifold and the feature manifold.

  • A weight factor is employed to distinguish the influence of each view in the latent subspace. Meanwhile, the $\ell_{1,2}$-norm is used to constrain the sparsity of the common subspace.

  • An effective algorithm is derived to solve the above problem, and the loss function is optimized iteratively with reliable convergence.
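
The dual graph regularization named in the contributions can be illustrated with the two penalty terms that dual-graph models typically add to a factorization X ≈ UVᵀ: one over a graph of samples and one over a graph of features. The sketch below is a generic NumPy illustration under that assumption and is not the ADMFL objective. The function names (`knn_graph`, `laplacian`, `dual_graph_penalty`) and the binary k-nearest-neighbour weighting are hypothetical.

```python
import numpy as np


def knn_graph(points, k=3):
    """Symmetric 0/1 k-nearest-neighbour affinity over the rows of `points`."""
    n = points.shape[0]
    sq = np.sum(points ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (points @ points.T)
    np.fill_diagonal(dist, np.inf)          # exclude self-matches
    W = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, :k]
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)


def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W


def dual_graph_penalty(X, U, V, k=3):
    """Return (data_term, feature_term) for X (M x N) ~ U (M x r) @ V.T."""
    L_data = laplacian(knn_graph(X.T, k))   # graph over the N samples (columns of X)
    L_feat = laplacian(knn_graph(X, k))     # graph over the M features (rows of X)
    data_term = np.trace(V.T @ L_data @ V)  # small when neighbouring samples share codes
    feat_term = np.trace(U.T @ L_feat @ U)  # small when neighbouring features share loadings
    return data_term, feat_term
```

Each term equals half the weighted sum of squared distances between the rows of the factor that are connected in the corresponding graph, so minimizing it pulls neighbouring samples (or features) toward similar low-dimensional representations.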

The rest of this paper is organized as follows. Some related works are introduced briefly in Section 2. In Section 3, the Adaptive Dual Graph-regularized Multi-View Non-Negative Feature Learning (ADMFL) method is developed and the corresponding optimization processes are presented. Experimental results, which are shown in Section 4, demonstrate that our proposed ADMFL model is superior to other multi-view learning methods on real-world datasets. Finally, the paper is concluded in Section 5.

Related work

Nonnegative Matrix Factorization (NMF) is a popular research direction in the field of matrix factorization that considers only matrices without negative entries. It factorizes the original matrix into two lower-dimensional matrices whose product linearly reconstructs each sample. Following this idea, Cai et al. [18] found that the local geometrical structure of the data manifold could describe the data distribution in original features and be projected to latent common

Adaptive Dual Graph-regularized Multi-View Non-Negative for law data Feature Learning

In this section, we illustrate the workflow of ADMFL as shown in Fig. 1. For given data items, various features are obtained to construct a multi-view dataset $D=\{X^{v}\}_{v=1}^{K}$ with $K$ views and $N$ instances. $X^{v}\in\mathbb{R}_{+}^{M_v\times N}$ denotes the feature matrix of the $v$th view with $M_v$-dimensional features, where $\mathbb{R}_{+}$ indicates that the data matrices are constrained to be non-negative. $\|X\|_F$ stands for the Frobenius norm of matrix $X$ throughout this paper. Considering both data space and feature space, it exploits their local

Experiments

Extensive experiments are designed to demonstrate the effectiveness of our ADMFL model on multi-view datasets.

Conclusion

In this paper, we propose a subspace learning model, called ADMFL, for law data learning. Inspired by the recent progress in dual regularization, it explores geometric structures of both data and feature manifold and learns view-specific features to generate a comprehensive representation. Moreover, ADMFL introduces the self-adaptive weight factor for each view and maintains the sparsity of latent common representation. Thus, a new objective function is defined, and the corresponding

Declaration of competing interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgment

This paper is supported by the National Key Research and Development Program of China (Grant No. 2018YFC0830203).

Xiru Qiu received the B.S. degree in Software Engineering from the Dalian University of Technology in 2015. Now, she is a Master student at the Dalian University of Technology. Her interests include data mining and transfer learning.

References (34)

  • Zhao L. et al.

    ICFS clustering with multiple representatives for large data

    IEEE Trans. Neural Netw. Learn. Syst.

    (2019)

  • Kalayeh M.M. et al.

    NMF-KNN: Image annotation using weighted multi-view non-negative matrix factorization

  • Liu J. et al.

    Multi-view clustering via joint nonnegative matrix factorization

  • Guan Z. et al.

    Multi-view concept learning for data representation

    IEEE Trans. Knowl. Data Eng.

    (2015)

  • Wang J. et al.

    Adaptive multi-view semi-supervised nonnegative matrix factorization

  • D. Hidru, A. Goldenberg, EquiNMF: Graph regularized multiview nonnegative matrix factorization, arXiv preprint 2014,...

  • Cai X. et al.

    Multi-view k-means clustering on big data

    Zhikui Chen received the B.E. degree in mathematics from Chongqing Normal University, Chongqing, China, in 1990, and the Ph.D. degree in solid mechanics from Chongqing University, Chongqing, China, in 1998. He is currently a Professor at Dalian University of Technology, Dalian, China. His research interests include Internet of Things and big data.

    Liang Zhao received the Ph.D. and M.S. degrees from the Dalian University of Technology, Dalian, China, in 2018 and 2014, respectively. He is currently a Teacher with the School of Software Technology, Dalian University of Technology. His current research interests include big data and artificial intelligence.

    Chengsheng Hu received the M.S. degree from Beijing University of Chemical Technology in 2016. He now works as an artificial intelligence engineer at Beijing Thunisoft Information Technology Co., Ltd., mainly studying machine learning and deep learning algorithms.
