Elsevier

Information Sciences

Volume 613, October 2022, Pages 731-746

Multi-view representation learning for data stream clustering

https://doi.org/10.1016/j.ins.2022.09.045
Under a Creative Commons license
open access

Highlights

  • The MVRL method learns a fused sparse affinity matrix across multiple views.

  • The MVRL method captures the global and local structures of data objects.

  • The complementary information is explored by exploiting affinity matrices.

  • The upper bound of computational cost is determined by closed-form solutions.

  • The dynamic set transfers previously learned knowledge to the arriving data objects.

Abstract

Data stream clustering provides valuable insights into the evolving patterns of long sequences of continuously generated data objects. Most existing clustering methods focus on single-view data streams. In this paper, we propose a multi-view representation learning (MVRL) method for multi-view clustering of data streams. We first introduce an integrated representation learning model to learn a fused sparse affinity matrix across multiple views for spectral clustering. Motivated by the optimization procedure of the integrated representation learning model, we propose three consecutive stages: collaborative representation, the construction of individual global affinity matrices using a mapping function, and the calculation of a fused sparse affinity matrix using Euclidean projection. These stages allow the effective capture of the global and local structures of high-dimensional data objects. Moreover, each stage has a closed-form solution, which determines the upper bound of the computational cost and memory consumption. We then employ the construction residuals of the collaborative representation to adaptively update a dynamic set, which is used to preserve the representative data objects. The dynamic set efficiently transfers previously learned useful knowledge to the arriving data objects. Extensive experimental results on multi-view data stream datasets demonstrate the effectiveness of the proposed MVRL method.
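The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective, the mapping function, or the dynamic-set update rule. The sketch assumes ridge-regularized coding as the collaborative representation, the absolute inner product of codes as a stand-in mapping function, simple averaging to fuse views, and row-wise Euclidean projection onto the probability simplex to obtain a sparse affinity. All function names and the parameter `lam` are hypothetical.

```python
import numpy as np


def collaborative_representation(D, X, lam=0.1):
    """Code each column of X over the columns of D (e.g. a dynamic set).

    Closed form of ridge regression: C = (D^T D + lam * I)^{-1} D^T X.
    """
    k = D.shape[1]
    gram = D.T @ D + lam * np.eye(k)
    return np.linalg.solve(gram, D.T @ X)


def residuals(D, X, C):
    """Per-object reconstruction error ||x - D c||_2 (one value per column)."""
    return np.linalg.norm(X - D @ C, axis=0)


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fused_sparse_affinity(views, lam=0.1):
    """views: list of (D_v, X_v) pairs, one per view.

    Each D_v has k columns and each X_v has n columns; the feature
    dimension may differ from view to view.
    """
    n = views[0][1].shape[1]
    fused = np.zeros((n, n))
    for D, X in views:
        C = collaborative_representation(D, X, lam)
        fused += np.abs(C.T @ C)              # assumed mapping function
    fused /= len(views)                       # assumed fusion: plain average

    P = np.zeros((n, n))
    for i in range(n):
        mask = np.arange(n) != i              # keep the diagonal at zero
        P[i, mask] = project_to_simplex(fused[i, mask])
    return (P + P.T) / 2.0                    # symmetric input for spectral clustering
```

Every step here is a closed-form or sort-based computation, so the cost per chunk is bounded by the dynamic-set size k and the chunk size n. The values returned by `residuals` indicate how well the current dynamic set represents each arriving object; how the paper uses them to update the set is not stated in the abstract and is left out of the sketch.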

Keywords

Data stream clustering
Representation learning
Multi-view data
High-dimensional data

This work was supported in part by the National Key Project under Grant GJXM92579, in part by the National Natural Science Foundation of China (NSFC) under Grant 61303015, and in part by the Sichuan Science and Technology Program under Grant 2021YJ0078.