Neurocomputing

Volume 257, 27 September 2017, Pages 175-184
Sign prediction in social networks based on tendency rate of equivalent micro-structures

https://doi.org/10.1016/j.neucom.2016.12.069

Abstract

Online social networks have significantly changed the way people shape their everyday communications. Signed networks are a class of social networks in which relations can be positive or negative. These networks emerge in areas where there is interplay between opposite attitudes such as trust and distrust. Recent studies have shown that the sign of a relationship is predictable using data already present in the network. In this work, we study the sign prediction problem in networks with both positive and negative links and investigate the application of network tendency to the prediction task. Accordingly, we develop a simple algorithm that can infer unknown relation types with high performance. We conduct experiments on three real-world signed networks: Epinions, Slashdot and Wikipedia. Experimental results indicate that the proposed approach outperforms state-of-the-art methods in terms of both overall accuracy and true negative rate. Furthermore, the low computational complexity of the proposed algorithm allows it to be applied to large-scale datasets.

Introduction

Online social networks have carried communication to a new stage, and billions of people around the world keep in contact through them. Users tend to exchange information on the basis of similarities such as common interests, kinship, friendship and acquaintanceship. However, as in any society, conflicts may arise among individuals. Signed networks are a class of social networks that aim to model this opposition. In such networks, existing relations are explicitly categorized as positive or negative. Positive interactions may represent friendship, interest or trust, while negative interactions indicate enmity, disinterest or distrust [7], [18], [20]. Some popular signed networks are Epinions, Slashdot, Wikipedia, Amazon, eBay and Advogato.

One of the challenges in signed networks is inferring the type of unknown relations, a task often referred to as sign prediction [18], [19]. Sign prediction is similar to link prediction, a well-studied problem in social network analysis [2], [21], [22]. However, it was not investigated until recent years. The problem was first introduced by Guha et al. [11] and later developed by Leskovec et al. [19], [20]. Approaches in this area fall broadly into four groups. Some algorithms are based on social theories and apply their predefined constraints to decide on unknown relations [20], [24]. Others are based on matrix factorization: they extract a kernel of the adjacency matrix and apply algebraic operators to obtain the prediction results [1], [5], [11], [14], [18]. A third group relies on machine learning techniques, building sets of features from implicit network characteristics and using them in a supervised classification task [6], [19], [25]. Finally, some studies follow recommender-system methodologies [15], [26]. The main challenge addressed in these studies is to find a solution that infers signs with lower complexity and better accuracy and that scales well with the properties of real networks. In addition, as these networks mostly comprise positive relations, the solution should also be able to identify negative relations.

To tackle these challenges, we developed an algorithm based on the tendency rate of triple micro-structures in a signed network. We studied the possible configurations that signed relations among three individuals can form. Then, we devised a probabilistic framework to compute the network’s tendency towards each configuration. Given this global trend, we measured the implicit forces that direct the sign of each relation in its neighborhood. Our work shows the important role of reciprocated relations and the inadequacy of social theories in modeling them accurately. It defines closed triple micro-structures and partitions them into equivalence classes. The probabilistic approach is introduced to measure the interplay between the various configurations. Finally, the performance of this approach is analysed and compared with a number of naive and state-of-the-art approaches.
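As a rough illustration of what measuring a network's leaning towards triple configurations can look like, the sketch below tallies the sign patterns of closed triples in a directed signed graph and reports their relative frequencies. It is not the authors' probabilistic framework or their equivalence classes: the function name `closed_triple_frequencies`, the pattern key (a sorted tuple of signs) and the toy graph are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations, permutations
from typing import Dict, Tuple

Edge = Tuple[int, int]  # directed edge (i, j)


def closed_triple_frequencies(edges: Dict[Edge, int]) -> Dict[Tuple[int, ...], float]:
    """Relative frequency of each sign pattern over all closed triples."""
    nodes = sorted({n for edge in edges for n in edge})
    counts: Counter = Counter()
    for triple in combinations(nodes, 3):
        # Closed: every pair in the triple is linked in at least one direction.
        closed = all((a, b) in edges or (b, a) in edges
                     for a, b in combinations(triple, 2))
        if not closed:
            continue
        # Collect the signs of all directed edges inside the triple, so a
        # reciprocated pair contributes two signs (hypothetical pattern key).
        signs = sorted(edges[(a, b)] for a, b in permutations(triple, 2)
                       if (a, b) in edges)
        counts[tuple(signs)] += 1
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()} if total else {}


# Hypothetical toy network with one reciprocated pair (1 <-> 2).
toy_edges = {(1, 2): 1, (2, 1): 1, (2, 3): -1, (3, 1): 1,
             (1, 4): 1, (4, 2): -1, (3, 4): 1}
frequencies = closed_triple_frequencies(toy_edges)
```

Patterns with four signs come from triples that contain a reciprocated pair, which is one simple way to keep reciprocity visible in the tally.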

Guha et al. were the first to consider trust and distrust explicitly when predicting the type of relations between users in a large real network [6], [9], [11]. They represented trust and distrust between users as real numbers between 0 and 1 in two distinct matrices and built a belief matrix from them. Then, they defined four types of atomic propagation as basic operations to model trust propagation between nodes. They encoded these atomic propagations as matrix operators and applied them to the belief matrix in sequence to obtain the final results [11]. Inspired by spreading activation models, Ziegler et al. proposed the Appleseed propagation model as a classification scheme for trust metrics. Later, they extended this metric to distrust and compared it with Advogato, another trust metric, to compute trust neighborhoods [28]. Shahriari and Jalili investigated node ranking algorithms in signed networks. They defined optimism and reputation based on node ranks. Using these metrics for each side of a relation, they built a four-dimensional feature vector and used it in a logistic classification approach to predict signs [25].
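The propagation scheme of Guha et al. can be sketched in a few lines. The example below assumes the commonly described operator forms for the four atomic propagations (direct propagation B, co-citation BᵀB, transpose trust Bᵀ and trust coupling BBᵀ) and a belief matrix B = T − D; these forms, the weights in `alphas` and the toy matrices are illustrative assumptions that should be checked against [11], not a reproduction of that paper's exact variants.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def combined_operator(belief: Matrix, alphas: Sequence[float]) -> Matrix:
    """Weighted sum of the four atomic propagation operators (assumed forms)."""
    bt = transpose(belief)
    atoms = [belief,               # direct propagation
             mat_mul(bt, belief),  # co-citation
             bt,                   # transpose trust
             mat_mul(belief, bt)]  # trust coupling
    n = len(belief)
    return [[sum(w * m[i][j] for m, w in zip(atoms, alphas)) for j in range(n)]
            for i in range(n)]


def propagate(belief: Matrix, alphas: Sequence[float], steps: int) -> Matrix:
    """Apply the combined operator `steps` times (matrix power)."""
    c = combined_operator(belief, alphas)
    result = c
    for _ in range(steps - 1):
        result = mat_mul(result, c)
    return result


# Hypothetical toy data: user 0 trusts 1, user 1 trusts 2, no distrust.
trust = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
distrust = [[0.0] * 3 for _ in range(3)]
belief = [[t - d for t, d in zip(tr, dr)] for tr, dr in zip(trust, distrust)]
propagated = propagate(belief, alphas=(0.4, 0.4, 0.1, 0.1), steps=2)
```

In this toy case the pair (0, 2) has no direct relation, yet two propagation steps give it a positive score, which would be read as a predicted trust link.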

Leskovec et al. studied the structure of signed networks and investigated the compatibility of two social theories with that structure [20]. Based on these theories, they built feature vectors and used them in a machine learning approach to solve the sign prediction problem. DuBois et al. treated the trust values present in the network as the probability of an edge between nodes in a random graph. They performed a reverse mapping in which the trust value of each pair corresponds to the probability of a path between them. Path probability and spring embedding values were combined in a two-dimensional vector to position the trust value of each edge [9]. Chiang et al. defined measures of social imbalance based on higher-order cycles (i.e., cycles of length greater than three) and used them to predict relation types. They showed that their imbalance measure plays the role of the Katz measure in signed networks. They also trained a supervised classifier (like the one proposed in [19]) based on these measures and reported the results [6]. Yang et al. devised a latent factor model (called Behavior Relation Interplay) that exploits users’ activity (e.g., movie ratings) to determine signed social ties in unsigned networks. They used the Pearson correlation score to assess similarity between ratings [26].

Some other recent studies are connected to social theories. Patidar et al. first generated a decision tree using C4.5 and user attributes (e.g., gender, career) to induce relation categories. Then, they employed a balance index (the proportion of balanced triads to total triads) to identify ties that increase the balance [23]. Qian and Adali extended balance theory by considering weak and strong ties. They discussed different configurations and participants’ behavior with regard to their tolerance and the network’s stress. Accordingly, they suggested a convergence model in the form of a Metric Multidimensional Scaling optimization problem and predicted edge signs based on this model [24]. In another work, Javari and Jalili followed a collaborative filtering approach, first discovering the community structure of the signed network and then applying the collaborative filtering framework. Their approach, although much less complex than machine learning methods, gives comparable results [15].
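The balance index mentioned above is easy to compute on a small undirected signed graph. The sketch below assumes the usual structural-balance convention that a triad is balanced when the product of its three signs is positive; the helper `balance_increasing_sign` and the toy data are hypothetical and only illustrate how a missing tie could be assigned the sign that yields the higher index, not the procedure of [23].

```python
from itertools import combinations
from typing import Dict, FrozenSet

Signs = Dict[FrozenSet[int], int]  # undirected tie -> +1 or -1


def balance_index(signs: Signs) -> float:
    """Proportion of balanced triads among all fully connected triads."""
    nodes = sorted({n for tie in signs for n in tie})
    balanced = total = 0
    for triple in combinations(nodes, 3):
        ties = [frozenset(pair) for pair in combinations(triple, 2)]
        if all(t in signs for t in ties):
            total += 1
            # Balanced when the product of the three signs is positive.
            if signs[ties[0]] * signs[ties[1]] * signs[ties[2]] > 0:
                balanced += 1
    return balanced / total if total else 0.0


def balance_increasing_sign(signs: Signs, tie: FrozenSet[int]) -> int:
    """Pick the sign for `tie` giving the higher balance index (+1 on a tie)."""
    scores = {s: balance_index({**signs, tie: s}) for s in (1, -1)}
    return 1 if scores[1] >= scores[-1] else -1


# Hypothetical toy graph: 1, 2, 3 are mutual friends; 4 is disliked by 1 and 2.
toy = {frozenset((1, 2)): 1, frozenset((2, 3)): 1, frozenset((1, 3)): 1,
       frozenset((1, 4)): -1, frozenset((2, 4)): -1}
index_before = balance_index(toy)
predicted = balance_increasing_sign(toy, frozenset((3, 4)))
```

Here a negative tie between 3 and 4 keeps every triad balanced, whereas a positive tie would leave two of the four triads unbalanced.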

Some studies adopt a matrix kernel approach. Kunegis et al. investigated network characteristics of Slashdot, such as the clustering coefficient, centrality and popularity of nodes. They used these measures to identify unpopular users and to predict link signs. They used exponentials of the adjacency matrix to predict the sign of edges, and also used dimensionality reduction and the Laplacian matrix as alternative ways to perform the task [18]. Ye et al. employed transfer learning to use information available in a source network to predict link signs in a target network. They built explicit (e.g., node degree, edge embeddedness) and latent topological features and employed an AdaBoost-like algorithm for the prediction [27]. In other studies, sign prediction is treated as a matrix completion problem and a matrix factorization technique is employed to solve it [1], [5], [13], [14]. For example, Chiang et al. verified the low-rank property of signed networks by studying the rank of the adjacency matrix of a complete k-weakly balanced network. Then, they employed a gradient-based matrix factorization method to provide an approximate completion of the network [5].
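To make the matrix-exponential idea concrete, the following sketch approximates exp(A) for a small symmetric signed adjacency matrix with a truncated Taylor series and reads the predicted sign from the corresponding entry. The truncation length, the helper names and the toy matrix are assumptions for illustration; [18] should be consulted for the kernels actually evaluated there.

```python
from typing import List

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_exp(a: Matrix, terms: int = 12) -> Matrix:
    """Truncated Taylor series: I + A + A^2/2! + ... + A^terms/terms!."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    factorial = 1.0
    for k in range(1, terms + 1):
        power = mat_mul(power, a)
        factorial *= k
        result = [[result[i][j] + power[i][j] / factorial for j in range(n)]
                  for i in range(n)]
    return result


def predict_sign(a: Matrix, i: int, j: int) -> int:
    """Sign of the exponential kernel entry; 0 means no evidence either way."""
    score = mat_exp(a)[i][j]
    return 1 if score > 0 else -1 if score < 0 else 0


# Hypothetical toy network: 0 and 1 are friends, 1 and 2 are foes.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, -1.0],
             [0.0, -1.0, 0.0]]
prediction = predict_sign(adjacency, 0, 2)
```

Each power of A sums the sign products of walks of that length, so the missing pair (0, 2) inherits a negative score from the walk friend-then-foe.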

Preliminaries

We follow the formulation introduced in [6] to formally define the sign prediction problem. We model signed networks by a graph $G=(V,E,\Sigma)$, where $V=\{1,2,\ldots,|V|\}$ is a set of nodes or vertices, $E=\{e_1,e_2,\ldots,e_{|E|}\}$ is a set of directed edges of the form $(i,j)$ for $i,j \in V$, and $\Sigma$ is a mapping $\Sigma: E \to \{+1,-1\}$ giving a sign to each edge. The sign prediction problem can be formally defined as follows. Given a graph $G=(V,E,\Sigma)$ and a test edge $e_{test} \in E$, we want to predict $\Sigma(e_{test})$ based on the edges in $E \setminus \{e_{test}\}$ …
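The formulation above maps naturally onto a dictionary from directed edges to signs. The sketch below shows that representation and the hold-out step implied by the problem statement; the names `hold_out` and `predict_majority`, and the majority-sign predictor itself, are placeholders chosen for this example rather than anything proposed in the paper.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]          # directed edge (i, j)
SignedGraph = Dict[Edge, int]   # the mapping Sigma: edge -> +1 or -1


def hold_out(graph: SignedGraph, test_edge: Edge) -> Tuple[SignedGraph, int]:
    """Return the graph without test_edge, plus the hidden true sign."""
    if test_edge not in graph:
        raise KeyError(f"{test_edge} is not an edge of the graph")
    training = {e: s for e, s in graph.items() if e != test_edge}
    return training, graph[test_edge]


def predict_majority(training: SignedGraph) -> int:
    """Placeholder predictor (an assumption): guess the majority sign."""
    return 1 if sum(training.values()) >= 0 else -1


# Hypothetical toy network: 4 nodes, 5 directed signed edges.
toy = {(1, 2): 1, (2, 1): 1, (2, 3): -1, (3, 1): 1, (1, 4): 1}
training, truth = hold_out(toy, (2, 3))
guess = predict_majority(training)
```

On this toy graph the placeholder predictor guesses +1 for the held-out edge while the true sign is −1, a reminder that a majority guess cannot recover negative relations.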

Characteristics of the datasets

Three online social networks are analysed in this paper: Epinions, Slashdot and Wikipedia. Epinions is a general consumer review website. In its communities, each user explicitly marks others as trustworthy or untrustworthy according to their product reviews. These attitudes can be represented by directed signed links. This dataset contains 131,828 nodes and 841,372 edges, 85.3% of which are positive. Additionally, 30.8% of

An algorithm based on tendency rate

In this section we introduce two sign prediction approaches. The first one is our proposed algorithm which considers structural properties of the network to decide on the type of new relations. The second one is a set of naive but useful metrics as a baseline to compare with the proposed algorithm. To better explain the former approach, we need to explore various micro-structures which have impacts on the signs, and measure their behavior across the network considering global tendency towards

Experiments

We introduced two approaches for sign prediction. One is based on CTMSs and the other is based on naive metrics. In this section, we benchmark their performance and compare it with the logistic regression (LR) proposed by Leskovec et al. [19], the collaborative filtering (CF) introduced by Javari and Jalili
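The abstract reports results in terms of overall accuracy and true negative rate. A minimal sketch of both measures follows, assuming signs are encoded as +1 and −1; the function name and the toy labels are hypothetical and the numbers are not results from the paper.

```python
import math
from typing import Sequence, Tuple


def accuracy_and_tnr(true_signs: Sequence[int],
                     predicted: Sequence[int]) -> Tuple[float, float]:
    """Overall accuracy and true negative rate for +1/-1 labels."""
    if len(true_signs) != len(predicted) or not true_signs:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(true_signs, predicted))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    negatives = [p for t, p in pairs if t == -1]
    # TNR: share of truly negative edges that were predicted negative.
    tnr = sum(p == -1 for p in negatives) / len(negatives) if negatives else math.nan
    return accuracy, tnr


# Hypothetical labels: mostly positive edges, as is typical of signed networks.
truth = [1, 1, 1, 1, -1]
always_positive = [1, 1, 1, 1, 1]
accuracy, tnr = accuracy_and_tnr(truth, always_positive)
```

A predictor that always answers +1 scores 0.8 accuracy here but a true negative rate of 0, which is why reporting both measures matters when positive edges dominate.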

Conclusion

Considering the important role of reciprocated relations in signed networks and the inadequacy of dominant social theories in modeling them, we introduced an algorithm that incorporates all possible forms of closed triple micro-structures to solve the sign prediction problem. The experiments conducted in different setups showed that the network’s implicit tendency towards each class of these micro-structures, together with their local pattern around each relation, is a strong determinant of sign. We applied the

Abtin Khodadadi received his MSc in Computer Science from the Institute of Advanced Studies in Basic Sciences, Zanjan. His research interests are in machine learning and social network analysis. He is now working towards his PhD at Oregon State University.

References (28)

  • M. Shahriari et al., Ranking nodes in signed social networks, Soc. Netw. Anal. Min. (2014)
  • P. Agrawal et al., Link label prediction in signed social networks, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (2013)
  • M. Al Hasan et al., A survey of link prediction in social networks, Social Network Data Analytics (2011)
  • D. Cartwright et al., Structural balance: a generalization of Heider’s theory, Psychol. Rev. (1956)
  • G. Chartrand et al., Graphs & Digraphs (2010)
  • K.-Y. Chiang et al., Prediction and clustering in signed networks: a local to global perspective, J. Mach. Learn. Res. (2014)
  • K.-Y. Chiang et al., Exploiting longer cycles for link prediction in signed networks, Proceedings of the 20th ACM International Conference on Information and Knowledge Management (2011)
  • D. Easley et al., Networks, Crowds, and Markets: Reasoning About a Highly Connected World (2010)
  • J.A. Davis, Clustering and structural balance in graphs, Hum. Relat. (1967)
  • T. DuBois et al., Predicting trust and distrust in social networks, Proceedings of 3rd International Conference on Privacy, Security, Risk and Trust (PASSAT), and 3rd International Conference on Social Computing (SocialCom) (2011)
  • J.L. Gross et al., Graph Theory and Its Applications, Second Edition (Discrete Mathematics and Its Applications) (2005)
  • R. Guha et al., Propagation of trust and distrust, Proceedings of the 13th International Conference on World Wide Web (2004)
  • F. Heider, Attitudes and cognitive organization, J. Psychol. (1946)
  • C.-J. Hsieh et al., Low rank modeling of signed networks, Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2012)
    Mahdi Jalili received his B.S. degree in Electrical Engineering from Tehran Polytechnique in 2001, his M.S. degree in Electrical Engineering from the University of Tehran in 2004, and his PhD from the Swiss Federal Institute of Technology Lausanne (EPFL) in 2008. He then joined Sharif University of Technology as an Assistant Professor. He is now a Senior Lecturer at RMIT University, Melbourne, Australia, and holds an Australian Research Council DECRA Fellowship and an RMIT Vice-Chancellor’s Research Fellowship. His research interests are in network science, dynamical systems, social network analysis and mining, and human brain functional connectivity analysis.

    This research was supported by Australian Research Council through grant No. DE140100620 to Mahdi Jalili.
