Network projection-based edge classification framework for signed networks

https://doi.org/10.1016/j.dss.2020.113321

Highlights

  • Spanning subgraph projection-based novel framework for edge classification.

  • Reduced information loss due to consideration of unlabeled edges.

  • Does not require domain-specific assumptions for edge classification.

  • Utilizes the network structure and edge label information in the network.

Abstract

Many real-world networks have signed relationships between nodes, and identifying these relationships is an important aspect of decision making. Because the existing signed relationships in a network may influence the relationships between other nodes, learning from them can support decision making in various mining tasks. Signed networks have attracted attention in recent years because of their relevance to applications such as categorization, recommendation, and relationship discovery, which support decisions in domains such as biology, social network analysis, communication, and knowledge graph construction. In this work, we focus on edge classification (sign/label prediction for edges) in unweighted and undirected signed networks, where the task is to predict the labels of unlabeled edges. Edge classification is challenging because edges in real-world signed networks are sparsely labeled. We use the labeled edges, together with structural information, to predict the signs of the unlabeled edges. We propose a novel framework named NPECF for this task. The framework is novel in how it utilizes the existing information in the signed network: it uses the unlabeled edges through three spanning subgraph projections of the given network, which reduces information loss. Experiments on four real-world datasets from different domains demonstrate the effectiveness of the proposed framework.

Introduction

With the existence of many signed real-world networks, predicting the signs of edges has become an important and valuable decision-making problem, as it can help to unlock the potential of these networks. Signed networks exist in different domains, such as biology, social networks, and communication, where the users/objects have positive or negative relationships with each other. Network structures are often not very rich in existing relationship information; still, predicting the signs of unlabeled relationships is required to support the decision-making process in applications such as categorization, recommendation, and relationship discovery [7,19] and knowledge graphs [16]. Knowledge graphs are also used as a tool for the development of the semantic web; they primarily explore the network of nodes and edges to identify the relationships between entities.

Classification of relationships in signed networks has significant managerial implications [1,14]. A positive-labeled edge in a signed network represents friendship, trust, love, or support, while a negative-labeled edge represents enmity, distrust, hate, or opposition [1,14]. Signed networks are prevalent in many real-world domains, and classifying the relationships between users/nodes can add great value to decision making. For example, on Epinions, a review website, users can like or dislike the reviews of other users [14]. The network of US senators and their positive and negative relationships, inferred from co-sponsorship data [3], is another example of a signed network in the social network domain. In the biology domain, Bornholdt [5] used a signed network to give a simplified representation of the yeast regulatory network, where interactions are classified into two types, i.e., activated (positive relationship) or repressed (negative relationship).

Different mining tasks can be performed on signed networks to gain rich insights. In this work, our focus is on the problem of edge classification (sign/label prediction for edges) for signed networks. In the edge classification problem, the task is to classify each unlabeled edge in the network as either positive-labeled or negative-labeled [15,18]. Node classification is a well-known and well-explored problem [1,20]. Edge classification in a signed network, however, is very challenging and is less explored than node classification [1]. For real-world signed networks, manually labeling the edges can be a daunting task because of the high cost in time and effort. Therefore, we can utilize the information in the labeled edges to classify the unlabeled edges. However, the existing techniques may not classify the edges accurately, because real-world networks are highly sparse and very few of their edges are labeled. Given this scarcity of labeled edges and the high sparsity of the networks, the existing machine learning-based techniques would not be effective for edge classification.

Earlier work on edge classification has utilized structural balance theory, which concerns the perceptions and attitudes of individuals, for undirected signed networks [6,10]. Yang et al. [24] and Leskovec et al. [15] are examples of work where structural balance theory was utilized along with the proposed methodology for edge classification in signed networks. However, these works leveraged domain-specific characteristics of graph-structured data, and their effectiveness depends on domain-specific assumptions [1]. These assumptions are not valid across domains, which limits the applicability of these methods [1,12]. Considering this, we propose a novel framework named NPECF for the classification of edges in an unweighted and undirected signed network. This framework is domain-independent and applicable to arbitrary domains for edge classification in signed networks. NPECF uses the structural information of the network to predict the signs of the unlabeled edges.
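For context, the structural balance rule that the earlier approaches rely on can be illustrated with a short Python sketch. It uses the standard statement of the theory, that a triangle is balanced when the product of its three edge signs is positive. The function names are hypothetical and are not taken from the cited works.

```python
def is_balanced_triangle(sign_ab, sign_bc, sign_ca):
    """Return True when a triangle with edge signs in {+1, -1} is balanced.

    Under structural balance theory, a triangle is balanced when it has
    an even number of negative edges, i.e. the product of signs is positive.
    """
    return sign_ab * sign_bc * sign_ca > 0


def balanced_closing_sign(sign_uw, sign_wv):
    """Sign for edge (u, v) that closes the triangle u-w-v in a balanced way.

    Hypothetical helper for illustration: "the friend of my friend is my
    friend" (+1 * +1 = +1), "the enemy of my friend is my enemy"
    (+1 * -1 = -1).
    """
    return sign_uw * sign_wv
```

A rule of this kind presumes that the network actually tends toward balance, which is the sort of domain-specific assumption the paragraph above refers to.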

The expected contribution of this study is a novel framework named NPECF for edge classification in a signed network. The framework generates three spanning subgraph projections of the given signed network such that each projection utilizes the unlabeled edges to reduce information loss while predicting the labels. Using these projections, NPECF computes the pairwise similarities of nodes in the network. Two of the projections contain, along with the unlabeled edges, only the positive edges and only the negative edges, respectively; their similarity scores are each compared with those of the third projection, which is the given network without label information. These two comparisons give the information loss, in terms of node similarity, of each of the two projections. For an unlabeled edge, the projection with the lower information loss is used to predict the label of that edge. The proposed framework is domain-independent because it works from the network structure and the edge label information present in the network, without specific assumptions about the problem domain or the characteristics of the graph.
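The projection-and-comparison idea described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the similarity measure (Jaccard similarity of closed neighborhoods), the use of the absolute difference as the information loss, the tie-breaking rule, and all function names are assumptions, since this section does not specify them.

```python
def build_adjacency(edges):
    """Build {node: set of neighbors} for an undirected, unweighted edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard(adj, u, v):
    """Jaccard similarity of closed neighborhoods (assumed similarity measure)."""
    nu = adj.get(u, set()) | {u}
    nv = adj.get(v, set()) | {v}
    return len(nu & nv) / len(nu | nv)


def classify_unlabeled(positive, negative, unlabeled):
    """Predict '+' or '-' for each unlabeled edge by comparing projections."""
    positive, negative, unlabeled = list(positive), list(negative), list(unlabeled)
    # Projection 1: positive edges plus unlabeled edges.
    adj_pos = build_adjacency(positive + unlabeled)
    # Projection 2: negative edges plus unlabeled edges.
    adj_neg = build_adjacency(negative + unlabeled)
    # Projection 3: the whole network with label information ignored.
    adj_all = build_adjacency(positive + negative + unlabeled)

    predictions = {}
    for u, v in unlabeled:
        reference = jaccard(adj_all, u, v)
        # Assumed loss: absolute change in similarity relative to projection 3.
        loss_pos = abs(reference - jaccard(adj_pos, u, v))
        loss_neg = abs(reference - jaccard(adj_neg, u, v))
        # Assumed tie-break: prefer the positive label when losses are equal.
        predictions[(u, v)] = "+" if loss_pos <= loss_neg else "-"
    return predictions


# Toy network: (a, d) sits among positive edges, (x, w) among negative ones.
positive = [("a", "b"), ("b", "d"), ("a", "c")]
negative = [("x", "y"), ("y", "w"), ("x", "z")]
unlabeled = [("a", "d"), ("x", "w")]
result = classify_unlabeled(positive, negative, unlabeled)
```

In this toy network the sketch labels (a, d) as positive and (x, w) as negative, because in each case one projection preserves the pair's similarity in the full network exactly while the other does not. Any other pairwise node similarity could be substituted for `jaccard` without changing the overall structure.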

The remainder of the paper is organized as follows. Section 2 presents the related work. Section 3 presents the background and the problem definition for edge classification. Section 4 presents the proposed framework NPECF for edge classification. The experimental setup and results are discussed in Section 5 and Section 6, respectively. Finally, Section 7 concludes the paper with future research directions.

Section snippets

Related work

Many data mining tasks can be performed on signed networks, such as node ranking, edge prediction, information diffusion, edge classification (sign/label prediction for edges), and negative sign prediction for edges [18,25]. Among these tasks, the present study focuses on edge classification in signed networks, which is a link-oriented task. Various approaches based on the behavior-relation interplay (BRI) model have been proposed for edge classification in signed networks [2,4,

Background and preliminaries

To understand the methodology of the proposed framework NPECF, we first need to understand the mathematical notation used. This section therefore presents the background, preliminaries, and important definitions. Some important and frequently used notations are listed in Table 1.

Signed networks are formally defined in Definition 1.

NPECF: framework for edge classification

In this section, we discuss the proposed framework for binary classification of unlabeled edges in the network. We also give an illustration to understand the working of the proposed framework.

Experimental setup

In this section, we discuss the experimental setup to perform a set of experiments on four real-world datasets from different domains. All experiments were performed on a system using R version 3.6.0.

Experimental results and discussion

In this section, we present the experimental results on the four datasets, namely Epinions, Slashdot Zoo, Wikipedia RfA, and Yeast GIN, for both the balanced and imbalanced cases.

Conclusion and future research directions

In this work, we have proposed a network projection-based framework named NPECF for binary classification of edges in unweighted and undirected signed networks. The proposed framework is effective in the prediction of signs of unlabeled edges. This framework utilizes three projections of the given network with very few edges signed to predict the sign of the unlabeled edges in the network. The proposed framework can utilize the information of the unlabeled edges in the network along with the

References (26)

  • R. Guha et al. Propagation of trust and distrust

  • J. Han et al. Classification: basic concepts

  • F. Heider. Attitudes and cognitive organization. J. Psychol. (1946)

Mukul Gupta is currently working as an Assistant Professor in the Information Systems area at Indian Institute of Management Indore, India. He received his Ph.D. in the Information Technology and Systems area from Indian Institute of Management Lucknow, India. He did his M.Tech in Computer Science from Dayalbagh Educational Institute, India, and his B.Tech in Computer Science and Engineering. His current research interests include e-Commerce, Recommendation Systems, Information Networks, Machine Learning, Social Media Analytics, Web and Data Mining.

Rajhans Mishra is an Associate Professor in the Information Systems area at Indian Institute of Management Indore (India). He has also served as a visiting faculty member at Indian Institute of Management Ahmedabad and Indian Institute of Management Lucknow. His research interests include recommendation systems, web mining, data mining, text mining, e-Governance and business analytics.
