Sign prediction in social networks based on tendency rate of equivalent micro-structures

doi:10.1016/j.neucom.2016.12.069

Neurocomputing

Volume 257, 27 September 2017, Pages 175-184

https://doi.org/10.1016/j.neucom.2016.12.069

Abstract

Online social networks have significantly changed the way people shape their everyday communications. Signed networks are a class of social networks in which relations can be positive or negative. These networks emerge in areas where there is interplay between opposite attitudes such as trust and distrust. Recent studies have shown that the sign of a relationship is predictable using data already present in the network. In this work, we study the sign prediction problem in networks with both positive and negative links and investigate the application of network tendency in the prediction task. Accordingly, we develop a simple algorithm that can infer unknown relation types with high performance. We conduct experiments on three real-world signed networks: Epinions, Slashdot and Wikipedia. Experimental results indicate that the proposed approach outperforms state-of-the-art methods in terms of both overall accuracy and true negative rate. Furthermore, the low computational complexity of the proposed algorithm allows it to be applied to large-scale datasets.

Introduction

Online social networks have brought communication to an advanced stage, and billions of people around the world keep in contact through them. People tend to exchange information according to shared ties such as common interests, kinship, friendship and acquaintanceship. However, as in other societies, conflicts may exist among individuals. Signed networks are a class of social networks that aim to model this opposition. In such networks, existing relations are explicitly categorized as positive or negative. Positive interactions may represent friendship, interest or trust, while negative interactions indicate enmity, disinterest or distrust [7], [18], [20]. Some popular signed networks are Epinions, Slashdot, Wikipedia, Amazon, eBay and Advogato.

One of the challenges in signed networks is to infer the type of unknown relations, which is often referred to as sign prediction [18], [19]. Sign prediction is similar to link prediction, a well-studied problem in social network analysis [2], [21], [22], but it has been investigated only in recent years. The problem was first introduced by Guha et al. [11] and later developed by Leskovec et al. [19], [20]. Generally, approaches in this area fall into four groups. Some algorithms are based on social theories and apply their predefined constraints to decide on unknown relations [20], [24]. Others are based on matrix factorization: they extract the kernel of the adjacency matrix and apply algebraic operators to obtain the prediction results [1], [5], [11], [14], [18]. Other algorithms are based on machine learning techniques. They build different sets of features from implicit network characteristics and use them in a supervised classification task [6], [19], [25]. Finally, there are some studies which follow recommender-system methodologies [15], [26]. The main challenge addressed in these studies is to find a solution that infers signs with lower complexity and better accuracy and that scales well with the properties of real networks. In addition, as these networks mostly comprise positive relations, the solution should be able to identify negative relations as well.

To tackle these challenges, in this work we developed an algorithm based on the tendency rate of triple micro-structures in a signed network. We studied the different configurations that can be formed by signed relations between three individuals. Then, we devised a probabilistic framework to compute the network tendency towards each configuration. Having this global trend, we measured the implicit forces that direct the sign of each relation in its neighborhood. Our work shows the important role of reciprocated relations and the inefficiency of social theories in modeling them accurately. It defines closed triple micro-structures and partitions them into equivalence classes. A probabilistic approach is introduced to measure the interplay between various configurations. Finally, the performance of this approach is analysed and compared with a number of naive and state-of-the-art approaches.

The explicit use of trust and distrust to predict the type of relations between users in a large real network was first implemented by Guha et al. [6], [9], [11]. They represented trust and distrust between users as real numbers between 0 and 1 in two distinct matrices and built a belief matrix from them. Then, they defined four types of atomic propagation as basic operations to model trust propagation between nodes. They encoded these atomic propagations as matrix operators and applied them to the belief matrix in sequence to obtain the final results [11]. Inspired by spreading activation models, Ziegler et al. proposed the Appleseed propagation model as a classification scheme for trust metrics. Later, they extended this metric to distrust and compared it with Advogato, another trust metric, for computing trust neighborhoods [28]. Shahriari and Jalili investigated node ranking algorithms in signed networks. They defined optimism and reputation based on node ranks. Using these metrics for each side of a relation, they built a four-dimensional feature vector and used it in a logistic classification approach to predict signs [25].
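The matrix formulation of Guha et al. can be illustrated in a few lines. The sketch below assumes the four atomic propagations described in [11] (direct propagation, co-citation, transpose trust and trust coupling), encoded as the operators $B$, $B^T B$, $B^T$ and $B B^T$ on a belief matrix $B$, combines them with weights, and applies the combined operator repeatedly. The weights `alphas`, the choice $B = T - D$, the number of steps and all function names are illustrative assumptions, not values taken from the original work.

```python
def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def subtract(a, b):
    """Element-wise difference a - b (used here to form B = T - D)."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def combine(belief, alphas):
    """Weighted sum of the four atomic propagation operators.

    Order of weights (an assumption of this sketch):
    direct (B), co-citation (B^T B), transpose trust (B^T), trust coupling (B B^T).
    """
    bt = transpose(belief)
    ops = [belief, matmul(bt, belief), bt, matmul(belief, bt)]
    n = len(belief)
    return [[sum(a * op[i][j] for a, op in zip(alphas, ops)) for j in range(n)]
            for i in range(n)]


def propagate(belief, alphas, steps):
    """Apply the combined operator `steps` times (k-th matrix power)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    c = combine(belief, alphas)
    p = c
    for _ in range(steps - 1):
        p = matmul(p, c)
    return p
```

With only direct propagation enabled, two steps reduce to the square of the belief matrix, so trust flows along paths of length two; the other three weights let beliefs also travel against edge direction or between users who trust the same targets.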

Leskovec et al. studied the structure of signed networks and investigated the compatibility of two social theories with that structure [20]. Based on these theories, they built feature vectors and used them in a machine learning approach to solve the sign prediction problem. DuBois et al. considered the trust values present in the network as the probability of having an edge between the nodes in a random graph. They performed a reverse mapping where the trust value of each pair corresponds to the probability of having a path between them. Path probability and spring embedding values were used in a two-dimensional vector to position the trust value of each edge [9]. Chiang et al. defined measures of social imbalance based on higher-order cycles (i.e., cycles of length greater than three) and used them to predict relation types. They showed that their imbalance measure plays the role of the Katz measure in signed networks. They also trained a supervised classifier (like the one proposed in [19]) based on these measures and reported the results [6]. Yang et al. devised a latent factor model (called Behavior Relation Interplay) which exploits users’ activity (e.g., movie ratings) to determine signed social ties in unsigned networks. They used the Pearson correlation score to assess similarity between ratings [26].

Some other recent studies are connected to social theories. Patidar et al. first generated a decision tree using C4.5 and user attributes (e.g., gender, career) to induce relation categories. Then, they employed the balance index (the proportion of balanced triads to total triads) to identify ties which increase the balance [23]. Qian and Adali extended balance theory by considering weak and strong ties. They discussed different configurations and participants’ behavior with regard to their tolerance and the network’s stress. Accordingly, they suggested a convergence model in the form of a Metric Multidimensional Scaling optimization problem and predicted edge signs based on this model [24]. In another work, Javari and Jalili followed a collaborative filtering approach by first discovering the community structure of the signed network and then applying the collaborative filtering framework. Their approach, while much less complex than machine learning methods, gives comparable results [15].
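The balance index mentioned above, the proportion of balanced triads among all triads, is easy to compute on a small example. The sketch below assumes an undirected signed graph stored as a dictionary and treats a triad as balanced when the product of its three signs is positive (the usual structural balance criterion). The function names and the tie-breaking rule in `best_sign` are illustrative assumptions, not details from [23].

```python
from itertools import combinations


def balance_index(signs):
    """Fraction of closed triads whose sign product is positive.

    signs: dict mapping frozenset({u, v}) -> +1 or -1 (undirected edges).
    Returns None when the graph contains no closed triad.
    """
    nodes = sorted({n for edge in signs for n in edge})
    balanced = total = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in signs for e in edges):
            total += 1
            if signs[edges[0]] * signs[edges[1]] * signs[edges[2]] > 0:
                balanced += 1
    return balanced / total if total else None


def best_sign(signs, u, v):
    """Pick the sign for the tie {u, v} that yields the higher balance index.

    Ties (and graphs with no triads) default to +1; this is an assumption
    of the sketch, not a rule from the cited work.
    """
    scores = {}
    for s in (+1, -1):
        trial = dict(signs)
        trial[frozenset((u, v))] = s
        idx = balance_index(trial)
        scores[s] = idx if idx is not None else 0.0
    return +1 if scores[+1] >= scores[-1] else -1
```

For instance, if `a` and `b` are friends and `b` and `c` are enemies, the only sign for the tie between `a` and `c` that closes a balanced triad is negative, which is what `best_sign` selects.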

There are some studies which adopt a matrix kernel approach. Kunegis et al. investigated some network characteristics on Slashdot, such as the clustering coefficient, centrality and popularity of nodes. They used these measures to identify unpopular users and predicted the link signs. They used exponentials of the adjacency matrix to predict the sign of edges. They also used dimensionality reduction and the Laplacian matrix as alternative ways to do the task [18]. Ye et al. employed transfer learning to use information available in a source network to predict link signs in a target network. They built explicit (e.g., node degree, edge embeddedness) and latent topological features and employed an AdaBoost-like algorithm for the prediction [27]. In other studies, sign prediction is investigated as a matrix completion problem and a matrix factorization technique is employed to solve it [1], [5], [13], [14]. For example, Chiang et al. verified the low-rank property of signed networks by studying the rank of the adjacency matrix of a complete k-weakly balanced network. Then, they employed a gradient-based matrix factorization method to provide an approximate completion of the network [5].

Section snippets

Preliminaries

We follow the formulation introduced in [6] to formally define the sign prediction problem. We model a signed network by a graph $G = (V, E, \Sigma)$, where $V = \{1, 2, \ldots, |V|\}$ is a set of nodes or vertices, $E = \{e_1, e_2, \ldots, e_{|E|}\}$ is a set of directed edges of the form $(i, j)$ for $i, j \in V$, and $\Sigma : E \to \{+1, -1\}$ is a mapping giving a sign to each edge. The sign prediction problem can be formally defined as follows. Given a graph $G = (V, E, \Sigma)$ and a test edge $e_{test} \in E$, we want to predict $\Sigma(e_{test})$ based on the edges in $E - \{e_{test}\}$
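As a concrete reading of this formulation, the sketch below stores a signed directed graph as a dictionary from edges $(i, j)$ to signs, hides one test edge, and asks a predictor for its sign using only the remaining edges. The names `hold_out` and `majority_sign` are hypothetical, and the majority-sign rule is only a placeholder predictor used to make the example runnable; it is not the method proposed in the paper.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]          # directed edge (i, j)
SignedGraph = Dict[Edge, int]   # the mapping Sigma: edge -> +1 or -1


def hold_out(sigma: SignedGraph, e_test: Edge) -> Tuple[SignedGraph, int]:
    """Split the graph into the edges E - {e_test} and the hidden true sign."""
    train = {e: s for e, s in sigma.items() if e != e_test}
    return train, sigma[e_test]


def majority_sign(train: SignedGraph, e_test: Edge) -> int:
    """Placeholder predictor: return the most common sign among training edges.

    It ignores e_test entirely; any real predictor would use the
    neighborhood of the test edge instead.
    """
    positives = sum(1 for s in train.values() if s > 0)
    return +1 if positives >= len(train) - positives else -1


# Toy network with four directed edges, one of them negative.
sigma = {(1, 2): +1, (2, 3): +1, (3, 1): -1, (2, 1): +1}
train, truth = hold_out(sigma, (3, 1))
prediction = majority_sign(train, (3, 1))
```

Here the hidden edge is negative while every remaining edge is positive, so the placeholder predicts the wrong sign, which shows why a predictor that only follows the majority class struggles with negative relations.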

Characteristics of the datasets

Three online social networks are analysed in this paper: Epinions, Slashdot and Wikipedia. Epinions is a general consumer review website. In its communities, each user explicitly marks others as trustworthy or untrustworthy according to their product reviews. These attitudes can be represented by directed signed links. This dataset contains 131,828 nodes and 841,372 edges, of which 85.3% are positive. Additionally, 30.8% of

An algorithm based on tendency rate

In this section we introduce two sign prediction approaches. The first one is our proposed algorithm which considers structural properties of the network to decide on the type of new relations. The second one is a set of naive but useful metrics as a baseline to compare with the proposed algorithm. To better explain the former approach, we need to explore various micro-structures which have impacts on the signs, and measure their behavior across the network considering global tendency towards
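The snippet above does not spell out how configurations are enumerated or how the tendency is computed, so the following is only a hypothetical illustration of the general idea, not the authors' algorithm. It assumes that a configuration around an edge $(u, v)$ is described by the signed links in both directions between $u$ and a common neighbor $w$ and between $w$ and $v$, that the network's tendency towards a configuration is estimated as the smoothed fraction of positive edges closing it, and that these rates are combined as summed log-odds. All names, the Laplace smoothing and the combination rule are assumptions of this sketch.

```python
import math
from collections import defaultdict


def neighbors(sigma):
    """Undirected neighbor sets built from the directed signed edges."""
    nbrs = defaultdict(set)
    for u, v in sigma:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def configs(sigma, nbrs, u, v):
    """Yield one configuration per common neighbor w of u and v.

    Each configuration records the signs of u->w, w->u, w->v and v->w,
    with 0 standing for a missing link, so reciprocated links are kept.
    """
    for w in (nbrs[u] & nbrs[v]) - {u, v}:
        yield ((sigma.get((u, w), 0), sigma.get((w, u), 0)),
               (sigma.get((w, v), 0), sigma.get((v, w), 0)))


def tendency_rates(sigma):
    """Smoothed share of positive closing edges for every configuration."""
    nbrs = neighbors(sigma)
    counts = defaultdict(lambda: [0, 0])  # configuration -> [negative, positive]
    for (u, v), s in sigma.items():
        for c in configs(sigma, nbrs, u, v):
            counts[c][1 if s > 0 else 0] += 1
    return {c: (pos + 1) / (neg + pos + 2) for c, (neg, pos) in counts.items()}


def predict(sigma, rates, u, v):
    """Sum the log-odds of the configurations around (u, v); ties give +1."""
    nbrs = neighbors(sigma)
    score = 0.0
    for c in configs(sigma, nbrs, u, v):
        p = rates.get(c, 0.5)  # unseen configurations contribute nothing
        score += math.log(p / (1 - p))
    return +1 if score >= 0 else -1
```

In use, `rates` would be computed once from the training edges and then reused for every test edge, so the cost per prediction depends only on the number of common neighbors of the two endpoints.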

Experiments

We introduced two approaches for sign prediction. One is based on closed triple micro-structures (CTMSs) and the other is based on naive metrics. In this section, we benchmark their performance and compare it with the logistic regression (LR) proposed by Leskovec et al. [19], the collaborative filtering (CF) introduced by Javari and Jalili
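The abstract reports results in terms of overall accuracy and true negative rate. As a reminder of what these two numbers measure, the sketch below computes both from lists of true and predicted signs. The function name and the toy labels are illustrative assumptions; they are not data from the paper's experiments.

```python
def accuracy_and_tnr(y_true, y_pred):
    """Return (overall accuracy, true negative rate) for signs in {+1, -1}.

    The true negative rate is the share of truly negative edges that were
    predicted negative; it is NaN when there are no negative edges.
    """
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("no predictions to evaluate")
    correct = sum(1 for t, p in pairs if t == p)
    negatives = [(t, p) for t, p in pairs if t == -1]
    true_negatives = sum(1 for _, p in negatives if p == -1)
    accuracy = correct / len(pairs)
    tnr = true_negatives / len(negatives) if negatives else float("nan")
    return accuracy, tnr


# Toy labels: three positive and two negative test edges.
y_true = [+1, +1, +1, -1, -1]
always_positive = [+1] * len(y_true)
acc_base, tnr_base = accuracy_and_tnr(y_true, always_positive)
```

Because most edges in these networks are positive, a predictor that always answers "positive" can reach a high accuracy while its true negative rate is zero, which is why both measures are reported together.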

Conclusion

Considering the important role of reciprocated relations in signed networks and the inefficiency of dominant social theories in modeling them, we introduced an algorithm that incorporates all possible forms of closed triple micro-structures to solve the sign prediction problem. The experiments conducted in different setups showed that the network’s implicit tendency towards each class of these micro-structures, together with their local pattern around each relation, is a strong determinant of sign. We applied the

Abtin Khodadadi received his MSc in Computer Science from the Institute for Advanced Studies in Basic Sciences, Zanjan. His research interests are in machine learning and social network analysis. He is now working towards his PhD at Oregon State University.

References (28)

M. Shahriari et al., Ranking nodes in signed social networks, Soc. Netw. Anal. Min. (2014)
P. Agrawal et al., Link label prediction in signed social networks, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (2013)
M. Al Hasan et al., A survey of link prediction in social networks, Social Network Data Analytics (2011)
D. Cartwright et al., Structural balance: a generalization of Heider’s theory, Psychol. Rev. (1956)
G. Chartrand et al., Graphs & Digraphs (2010)
K.-Y. Chiang et al., Prediction and clustering in signed networks: a local to global perspective, J. Mach. Learn. Res. (2014)
K.-Y. Chiang et al., Exploiting longer cycles for link prediction in signed networks, Proceedings of the 20th ACM International Conference on Information and Knowledge Management (2011)
D. Easley et al., Networks, Crowds, and Markets: Reasoning About a Highly Connected World (2010)
J.A. Davis, Clustering and structural balance in graphs, Hum. Relat. (1967)
T. DuBois et al., Predicting trust and distrust in social networks, Proceedings of 3rd International Conference on Privacy, Security, Risk and Trust (PASSAT), and 3rd International Conference on Social Computing (SocialCom) (2011)
J.L. Gross et al., Graph Theory and Its Applications, Second Edition (Discrete Mathematics and Its Applications) (2005)
R. Guha et al., Propagation of trust and distrust, Proceedings of the 13th International Conference on World Wide Web (2004)
F. Heider, Attitudes and cognitive organization, J. Psychol. (1946)
C.-J. Hsieh et al., Low rank modeling of signed networks, Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2012)


Mahdi Jalili received his B.S. degree in Electrical Engineering from Tehran Polytechnique in 2001, his M.S. degree in Electrical Engineering from the University of Tehran in 2004, and his PhD from the Swiss Federal Institute of Technology Lausanne (EPFL) in 2008. He then joined Sharif University of Technology as an Assistant Professor. He is now a Senior Lecturer at RMIT University, Melbourne, Australia, and holds an Australian Research Council DECRA Fellowship and an RMIT Vice-Chancellor’s Research Fellowship. His research interests are in network science, dynamical systems, social network analysis and mining, and human brain functional connectivity analysis.

☆ This research was supported by the Australian Research Council through grant No. DE140100620 to Mahdi Jalili.
