Node proximity preserved dynamic network embedding via matrix perturbation
Introduction
Networks are ubiquitous in real life and effectively express the relationships among objects in complex systems. As networks have become more popular, an increasing number of media platforms, such as blogs and Flickr, use a network as their information carrier. However, real-world networks can be extremely large, and their connection structure is complicated. As a result, computational costs are usually high, which remains an obstacle to processing information in real-world networks. It is therefore worthwhile to find an effective way to analyze network data. One paradigm for overcoming this obstacle is network embedding, which represents each node of a given network as a dense real-valued vector in a low-dimensional latent space while preserving essential structural properties (e.g., edges and other high-order proximities [1], [2]) of the network. The learned embedding vectors can benefit a variety of practical applications [3], [4], such as multi-label classification [5], [6], [7], [8] and link prediction [9], [10].
Many excellent network embedding methods have been proposed recently. Perozzi et al. [11] first introduced the skip-gram model with negative sampling [12] and employed truncated random walks to learn latent node representations. Tang et al. [13] optimized a carefully designed objective function that preserves both first-order and second-order proximities. Keikha et al. [14] introduced an algorithm named “CARE”, which can be used for different types of networks, including weighted, directed and complex networks. Cao et al. [15] modeled the global information of given networks by applying SVD and achieved good performance. Furthermore, Yang et al. [16] revealed that several neural network-based algorithms actually perform matrix factorization. Goyal et al. [17] provided a comprehensive and structured analysis of the various embedding algorithms proposed in the literature. Xue et al. [18] proposed CDNR, which integrates a cross-domain two-layer node-scale balance algorithm and a cross-domain two-layer knowledge transfer algorithm to achieve cross-domain transfer in random walks and to enable network representation for scarcely structured networks. Huang et al. [19] proposed AMVAE (Attention-based Multi-view Variational Auto-Encoder), which fuses links and multi-modal contents for network embedding. However, real-world networks are usually dynamic, and the edges among vertices evolve continuously over time. For example, users in a social network may establish new connections with each other, while interactions among proteins in a biological network vary over time. Most existing embedding methods focus only on static networks and ignore such changes. To address this issue, a collection of algorithms for dynamic network embedding has been presented. Zhou et al. [20] proposed dynamicTriad, which models how an open triad evolves into a closed one, thus preserving both the structural information and the evolution of given networks. Nguyen et al. [21] incorporated temporal dependencies into node embedding and deep graph models, thereby integrating the temporal information of dynamic networks. In [22], the evolving patterns of given networks are preserved by using evolving random walks and initializing the current embedding vectors with the previous ones. Ahmed et al. [23] presented a method based on non-negative matrix factorization (NMF) to learn latent features from the temporal structure of the given dynamic network. Cui et al. [24] proposed a generalized eigen perturbation to incrementally incorporate the changes of given networks.
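The incremental idea behind eigen perturbation can be illustrated with a minimal Python sketch. It applies the standard first-order perturbation formulas to an ordinary symmetric eigenproblem, which is a simplification assumed here; the generalized eigen perturbation of [24] is not reproduced. The function name `perturb_eigenpairs`, the 5-node path graph, and the edge weight 0.05 are hypothetical choices made only for illustration.

```python
import numpy as np


def perturb_eigenpairs(eigvals, eigvecs, delta_a):
    """First-order update of eigenpairs of a symmetric matrix A after A -> A + delta_a.

    eigvals: (k,) eigenvalues of A, assumed distinct.
    eigvecs: (n, k) matching orthonormal eigenvectors, one per column.
    Only the k tracked eigenvectors are used in the eigenvector correction.
    """
    # proj[j, i] = x_j^T (delta_a) x_i
    proj = eigvecs.T @ delta_a @ eigvecs
    # Eigenvalue update: lambda_i + x_i^T (delta_a) x_i
    new_vals = eigvals + np.diag(proj)
    # gaps[j, i] = lambda_i - lambda_j; the diagonal is masked to avoid 0/0
    gaps = eigvals[None, :] - eigvals[:, None]
    np.fill_diagonal(gaps, 1.0)
    coeff = proj / gaps
    np.fill_diagonal(coeff, 0.0)
    # Eigenvector update: x_i + sum_{j != i} coeff[j, i] * x_j, then renormalize
    new_vecs = eigvecs + eigvecs @ coeff
    new_vecs /= np.linalg.norm(new_vecs, axis=0, keepdims=True)
    return new_vals, new_vecs


# Toy data (hypothetical): a 5-node path graph gains a weak edge between nodes 0 and 4.
n = 5
adj = np.zeros((n, n))
for u in range(n - 1):
    adj[u, u + 1] = adj[u + 1, u] = 1.0
delta = np.zeros((n, n))
delta[0, 4] = delta[4, 0] = 0.05

vals, vecs = np.linalg.eigh(adj)
approx_vals, approx_vecs = perturb_eigenpairs(vals, vecs, delta)
exact_vals, exact_vecs = np.linalg.eigh(adj + delta)

print("max eigenvalue error:", np.max(np.abs(approx_vals - exact_vals)))
```

Because the correction is only first order, its accuracy degrades when the change is large or when eigenvalues are nearly equal, so an occasional full recomputation would be a reasonable safeguard.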
Most of the works mentioned above attend only to the static properties of given networks and ignore their dynamic properties. In the real world, however, the edges between nodes usually evolve over time: most networks, such as social networks and protein networks, add or delete connections between nodes. Efficiently updating embeddings for newly added or deleted edges is therefore a challenge for static network embedding methods. Although researchers have made efforts on dynamic network embedding, existing methods still fail to retain the proximities among nodes, which have been shown to be crucial structural properties of networks. Therefore, it is necessary to explore how to preserve network dynamics in network representations, and how to incorporate the evolution of dynamic networks while simultaneously preserving node proximities remains an open problem in the literature.
As mentioned above, most previous embedding methods cannot adequately preserve the evolving characteristics that are an essential property of real-world networks. Motivated by this, we design a dynamic network embedding model that updates low-dimensional node representations by characterizing the increment of the adjacency matrix while preserving node proximities. Specifically, we first review the DeepWalk algorithm [11] and show theoretically that DeepWalk is equivalent to implicitly factorizing a matrix, so that the node embedding vectors can be obtained by generalized SVD. We then capture node proximities of different orders based on the eigen-decomposition reweighing theorem and update the embedding vectors via generalized eigen perturbation to capture the evolution of given networks. To summarize, our contributions are as follows:
- We present a novel model of dynamic network embedding, based on matrix factorization, that preserves both node proximities and the changes of dynamic networks.
- We implement our model using generalized SVD and the eigen-decomposition reweighing theorem, and we employ generalized eigen perturbation to update node representations.
- Extensive experiments on several real-world networks in various applications demonstrate that our model outperforms state-of-the-art baseline models.
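As a rough illustration of the reweighting idea named in the second contribution, the sketch below relies on a basic linear-algebra fact: for a symmetric matrix A with eigenpairs (λ, x), a polynomial proximity S = w_1 A + w_2 A^2 + ... has the same eigenvectors as A and eigenvalues w_1 λ + w_2 λ^2 + ..., so S never has to be formed explicitly. This is a simplified sketch under stated assumptions, not the NPDNE algorithm: the function name `reweighted_embedding`, the toy graph, the weights, and the scaling of eigenvectors by the square root of the reweighted eigenvalue magnitudes are all hypothetical choices.

```python
import numpy as np


def reweighted_embedding(adj, weights, dim):
    """Embed nodes from S = sum_k weights[k-1] * adj^k using only the eigenpairs of adj.

    adj must be symmetric; weights[k-1] is the (assumed) weight of the k-th order proximity.
    """
    eigvals, eigvecs = np.linalg.eigh(adj)
    # Reweight each eigenvalue: lambda -> sum_k w_k * lambda^k
    reweighted = np.zeros_like(eigvals)
    for order, w in enumerate(weights, start=1):
        reweighted += w * eigvals ** order
    # Keep the dim eigenpairs whose reweighted eigenvalues are largest in magnitude
    top = np.argsort(-np.abs(reweighted))[:dim]
    embedding = eigvecs[:, top] * np.sqrt(np.abs(reweighted[top]))
    return embedding, reweighted, eigvecs


# Toy undirected graph (hypothetical data).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
adj = np.zeros((5, 5))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

weights = [1.0, 0.5, 0.25]  # assumed weights for orders 1, 2 and 3
embedding, reweighted, eigvecs = reweighted_embedding(adj, weights, dim=2)

# Check the reweighting against the proximity matrix computed directly.
prox_direct = sum(
    w * np.linalg.matrix_power(adj, k) for k, w in enumerate(weights, start=1)
)
prox_from_eig = eigvecs @ np.diag(reweighted) @ eigvecs.T
print("reweighting matches direct computation:", np.allclose(prox_direct, prox_from_eig))
```

Changing the weights only changes the scalar reweighting step, so different proximity orders can be explored from a single eigen-decomposition.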
The remainder of this paper is organized as follows. Section 2 reviews related work, including neural network based methods and matrix factorization based methods. Section 3 formally defines our method and introduces it in detail. Section 4 describes the datasets and validates the proposed model on them. Finally, Section 5 presents the conclusions of this paper.
Related works
Recently, network embedding, which learns a low-dimensional vector for each node of a given network, has attracted tremendous interest. In terms of basic techniques, network embedding algorithms can be divided into two broad categories: neural network based methods and matrix factorization based methods. Neural network based methods have achieved notable results in practical applications such as community detection and recommender systems. Different from
Methodology
We introduce our model for dynamic network embedding in this section. In particular, we give the problem definition and notation in Section 3.1 and elaborate on the NPDNE model in Section 3.2.
Experiments
In this section, we validate the effectiveness of our NPDNE model. Section 4.1 describes the five experimental datasets, Section 4.2 lists the evaluation metrics used in the experiments, and Section 4.3 introduces the baseline algorithms. In Section 4.4, we analyze the experimental results and investigate parameter sensitivity.
Conclusion
In this paper, we propose NPDNE, which learns network embedding representations while simultaneously preserving node proximities and the evolution of given networks. We conduct four experiments on several real-world networks, and the results demonstrate the efficacy of our method. In the future, we will investigate a general framework to supplement our model and extend NPDNE to directed networks as well as heterogeneous networks.
CRediT authorship contribution statement
Bin Yu: Supervision, Methodology. Bing Lu: Methodology, Conceptualization, Writing - original draft, Writing - review & editing. Chen Zhang: Conceptualization, Supervision, Writing - original draft. Chunyi Li: Software, Writing - review & editing. Ke Pan: Software, Writing - original draft.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 61502365); the Key Research and Development Program of Shaanxi Province (Grant nos. 2019ZDLGY17-01, 2019GY-042); and the National Key Research and Development Program of China (Grant no. 2019YFC1606705).
References (52)
- et al., FREERL: Fusion relation embedded representation learning framework for aspect extraction, Knowl. Based Syst. (2017)
- et al., Using neural word embeddings to model user behavior and detect user segments, Knowl. Based Syst. (2016)
- et al., Knowledge-enhanced document embeddings for text classification, Knowl. Based Syst. (2019)
- et al., Community aware random walk for network embedding, Knowl. Based Syst. (2018)
- et al., Graph embedding techniques, applications, and performance: A survey, Knowl. Based Syst. (2018)
- et al., Network embedding by fusing multimodal contents and links, Knowl. Based Syst. (2019)
- et al., Nonnegative low-rank representation based manifold embedding for semi-supervised learning, Knowl. Based Syst. (2017)
- et al., Networks, crowds, and markets: Reasoning about a highly connected world, Math. Comput. Educ. (2012)
- et al., Link Mining: Models, Algorithms, and Applications (2010)
- Z. Yang, W.W. Cohen, R. Salakhutdinov, Revisiting semi-supervised learning with graph embeddings, in: Proceedings of...
- Heterogeneous information network embedding for meta path based proximity
- Fast, warped graph embedding: Unifying framework and one-click algorithm
- Cross-domain network representations, Pattern Recognit.
- Deepeye: Link prediction in dynamic networks based on non-negative matrix factorization, Big Data Min. Anal.
- High-order proximity preserved embedding for dynamic networks, IEEE Trans. Knowl. Data Eng.
- Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst.