iSim: An Efficient Integrated Similarity Based Collaborative Filtering Approach for Trust Prediction in Service-Oriented Social Networks

Liao, Mingding; Liu, Xiao; Gao, Xiaofeng; Zhong, Jiaofei; Chen, Guihai

doi:10.1007/978-3-319-46295-0_31

Mingding Liao¹⁷,
Xiao Liu¹⁷,
Xiaofeng Gao¹⁷,
Jiaofei Zhong¹⁸ &
…
Guihai Chen¹⁷

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 9936))

Included in the following conference series:

International Conference on Service-Oriented Computing

2720 Accesses

Abstract

Service-oriented social networks gain increasing popularity among a huge user base in recent years. In social networks, trust prediction is significant for recommendations of high-quality service providers as well as in many other applications. In the literature, trust prediction problem can be solved by several strategies, such as matrix factorization, trust propagation, and K-NN search, etc. However, most of the existing works have not considered the possible complementarity among these mainstream strategies to optimize their effectiveness and efficiency. In this paper, we propose a novel trust prediction approach named iSim: an integrated similarity based collaborative filtering approach leveraging on user similarity, which integrates three kinds of factors to measure user similarity, including vector space similarity, matrix factorization, and propagated trust. This paper is the first work in the literature employing matrix factorization and propagated trust in the study of similarity. Additionally, we use several methods like adding inverted index to reduce the time complexity of iSim, and provide its theoretical time bound. Finally, the extensive experiments with real-world dataset show that iSim achieves great improvement for both efficiency and effectiveness over the state-of-the-art approaches.

This work has been supported in part by the China 973 project (2014CB340303), National Natural Science Foundation of China (Grant numbers 61133006, 61472252), the Opening Project of Key Lab of Information Network Security of Ministry of Public Security (The Third Research Institute of Ministry of Public Security) Grant number C15602, the Opening Project of Baidu (Grant number 181515P005267), the Open Project Program of Shanghai Key Laboratory of Data Science (Grant number 201609060001).

You have full access to this open access chapter, Download conference paper PDF

Stochastic trust network enriched by similarity relations to enhance trust-aware recommendations

Article 31 August 2018

Generating Recommendations via Trust-Aware Recommendation System by the Topological Impact of Users in Social Trust Networks

Recommendation algorithm of influence and trust relationship

Article 28 February 2022

Keywords

1 Introduction

The rapid development of online social network makes it an important platform for services, which leads to service-oriented social networks like Epinion.com. Users expect to obtain high-quality service through such platforms, and they usually make their choices according to the credits of service providers. Thus, trustworthiness has been considered as one of the most important factors for recommendations on high-quality service providers.

Some trust information might be gathered explicitly through targeted user survey feedback or implicitly through user behavioral analytics. However, a user can get in touch with only a few other users, so the trust information that can be collected is insufficient, which leads to a challenging problem of trust prediction to infer the unknown trust between any two users.

More specifically, in a service-oriented social network trust model, each user is represented by a node in a graph, while the trust relationship and the amount of trust is indicated by an weighted edge between two users. Since most of the pairwise trust ratings and reliability factors of the users are unknown, if we consider a sparse $n \times n$ trust rating matrix T for users, the objective of trust prediction problem is to predict the unknown values of matrix using some evidence derived from user behaviors or other metrics.

Trust prediction problem can be studied by several strategies, such as trust propagation [2, 8, 11, 15–17], matrix factorization [23, 27, 28], and K-NN search [5, 25], etc. However, the existing methods have many drawbacks: (1) Majority of them ignore the possible complementarity among mainstream strategies, yet it is helpful to solve trust prediction problem. (2) They suffer from low query efficiency in terms of time. Although some approaches speed up by preprocessing, they bear the consequences of high time cost in preprocessing. (3) Their prediction accuracy still have room for improvement.

To resolve the above drawbacks, we propose a novel efficient method named iSim: an integrated similarity based collaborative filtering approach for trust prediction with the help of user similarity. iSim integrates three similarity factors, say, vector space similarity, matrix factorization, and propagated trust, among which matrix factorization and propagated trust are first leveraged in the study of similarity. Next, we involve several methods like adding inverted index to reduce the time complexity of iSim, and compute its theoretical time bound precisely. Finally, we compare iSim with two state-of-the-art trust inference approaches: PROP [7] and SCMF [28] by extensive experiments with real-world dataset. Experiments show that iSim improves the predicting accuracy by $11.9\,\%$–$30.5\,\%$, and its execution time is only $6.8\,\%$ of SCMF and $3.5\,\%$ of PROP.

In all, our contribution is summarized as follows: (1) We propose a novel integrating method based on collaborative filtering for trust prediction. (2) We define a hybrid similarity involving matrix factorization and propagated trust, which might be generalized to other related fields. (3) We have conducted extensive experiments on a trace dataset, which demonstrate that our approach achieves much higher accuracy and greatly faster speed than state-of-the-art approaches.

2 Related Works

Existing works for trust prediction in online social networks can be categorized into several classes: propagation based trust prediction, matrix factorization and matrix recovery trust, K-NN based methods, as well as other ad hoc methods.

Propagation Based Trust Prediction. Considering all paths from the source user to the target user, Hang et al. [8] predicted trust by calculating propagated trust through concatenation, aggregation, and selection operations. The multiplication propagating strategy has been exploited in [15, 24], which predicts trust between two users based on the product of trust values on all links of a path. Moveover, Li et al. [15] discussed the feasibility of this strategy based on Baysian Inference. On the other hand, the average propagating strategy has been utilized in [17] for trust propagation, which predicts trustworthiness based on a weighted average of trust values on all links of a path between two users. Furthermore, min/max propagating strategy predicts trust between two users based on the minimal or maximal trust values on all links of a path, which is leveraged in [2, 22], and [22] improves the efficiency of trust propagation by preprocessing. Instead of considering all possible paths, the trust prediction problem is transformed into the restricted k-optimal path problem [11, 16, 17], by establishing constraint on propagated trust path. Besides considering propagated trust on paths, path variance and path length are introduced to adjust the propagated trust [7].

Matrix Factorization and Matrix Recovery Trust. Matrix factorization can explain the trust ratings by featuring both trustors and trustees on several latent factors inferred from rating values. A trust prediction model was presented based on rating matrix factorization, which integrates both propagated trust and other related information to improve the result [23]. Another work proposed similarity regularization term to constrain matrix factorization [28]. Zheng et al. utilized matrix factorization in trust prediction in different context [27]. Huang et al. introduced joint factorization with the integration of user-item rating matrix and trust rating matrix [10]. Additionally, Matrix recovery was also exploited to predict unknown value in trust matrix [9].

K -NN Based Methods. According to the similarity of user ratings to films or commodities, the trustworthiness of a user to others can be represented by the weighted average of trust propensity of other K users who are the most similar to this user [5]. Another work integrated the authority similarity, interest similarity, and rating similarity to calculate the K-NN for trust prediction [25]. The main difference between iSim and the existing K-NN approaches is that iSim does not need additional information other than the trust ratings for trust prediction, while the others require much extra information.

Other Approaches. Trust sub-network extraction can improve the accuracy of trust prediction, and it has been studied with binary ant colony algorithm [26]. Trust balance theory can also be employed for trust prediction and adjustment [18]. The network flow from the source to the target can be transformed into the trust between them if we convert trust network to flow network [12]. Orman developed a Bayesian formulation, and a Bayesian Network analysis was employed to compute the unknown trust values [20].

3 Problem Statement

In most cases, trust rating value is defined as the belief one user has in the other. These values can be organized to a square matrix T with size n, which we call an n-user Trust Rating Matrix. We use $T_{i,j}$ to indicate the trust rating value a user i gives to user j. We call the user who gives trust ratings to others as trustor, and the user who receives trust ratings as trustee. To simplify our calculation, we assume that one user gives trust rating value to at least one other user. We denote our predicted trust rating value matrix as $\widetilde{T}$, whose value is generated by our prediction approach iSim. Definition 1 presents the basic problem statement.

Definition 1

(Trust Prediction Problem). Given a trust rating matrix T, whose non-zero elements denote the trust rating values from the row users to the column users. Predict the most proper values for zero elements and fill T as $\widetilde{T}$.

4 iSim: A Novel Trust Prediction Approach

Now we formally introduce our integrated similarity based trust prediction approach iSim, including its motivation, configuration, and detailed description.

Motivation: In a large-scale social network, some users may share, to some degree, similar kind of standards or preference when giving trust rating values. For example, as showed in Fig. 1, suppose there are 4 trusters $i_1, i_2, i_3, i_4$ and many trustees $j_k$($k=1,...,n$) in the network. Truster $i_1$, $i_2$ and $i_3$ have similar behaviors on putting trust ratings on trustee $j_k$ while $i_4$ does not, that is, $i_1$, $i_2$, $i_3$ share high behavior similarity with each other and low with $i_4$. Different from $i_3$, $i_2$ and $i_4$ is given a trust rating directly by $i_1$, which suggests that $i_1$ is closely connected with $i_2$ and $i_4$. Comprehensively considering both factors of behavior similarity and propagation, we can finally speculate that $i_1$ and $i_2$ share higher iSim similarity when rating others, which will greatly assist the prediction of the unknown values from $i_1$ to $j_2$ in T.

Work Flow: Figure 2 shows the flow path of iSim, which contains three main parts: iSim-VSS, iSim-LFS and iSim-PT. We first calculate the three parts independently based on the input trust rating matrix T, and then integrate them appropriately to get the similarity values between users. Finally, we generate an output predicted trust rating matrix $\widetilde{T}$ from these similarities based on K-NN collaborating filtering method.

In detail, for iSim-VSS, we use row vectors in T to describe a user’s preference and find similarity between these preferences. Because this similarity is extracted directly from T, we call it explicit similarity. For iSim-LFS, we describe users’ trust rating styles by the row vectors of specific matrix factorized from T, which reflect a promising similarity. Since this similarity is lurked in T, we call it implicit similarity. For iSim-PT, we depict the trust relationship between users via the weight-sums along the paths linking the two users in the social network generated by T. We then integrate the above three parts and adopt K-NN method to pick up top-K users who share the highest similarities with a particular user i. Finally, we will comprehensively consider the rating values these K users give to user j (if any) to predict the trust value user i will give to user j, which leads to our final predicted trust rating matrix $\widetilde{T}$.

iSim-VSS: The similarity between two users could be described by the rating values they give to their common trustees [16]. This kind of similarity, known as vector space similarity (VSS), shows how much the two users are in common on their rating preferences. Correspondingly, $VSS_{i,j}$, which is the VSS between user i and user j, can be defined as Eq. (1).

$$\begin{aligned} VSS_{i,j}=\frac{\sum \limits _{k \in From_i \cap From_j}T_{i,k} \cdot T_{j,k}}{\sqrt{\sum \limits _{k \in From_i \cap From_j}T_{i,k}^2}\cdot {\sqrt{\sum \limits _{k \in From_i \cap From_j}T_{j,k}^2}}} \end{aligned}$$

(1)

In Eq. (1), $From_i$ is the set of users that i trusts, and k is the user trusted by both i and j. Note that whenever we use terms like $VSS_{i,j}$ to represent the corresponding value related to users i and j, we will use VSS to represent a matrix storing such type of values by default. However, computing Eq. (1) is extremely time consuming, so we introduce inverted index to reduce the time complexity [14]. Algorithm 1 gives out the details. Here $To_k$ is the set of users who give a trust rating value to user k, and we also call it the inverted index of i. Through inverted index, when calculating the similarity between user i and user j, we can avoid enumerating the users to whom user i or j does not give trust rating value, which can bring significant improvement on time efficiency.

iSim-LFS: Sparsity problem has a fatal influence on the calculation of VSS. As a result, VSS alone cannot satisfy our demand when determining the similarity between two users. Therefore, we propose latent factor similarity (LFS) to overcome the sparsity problem, which describes a user’s rating style. Actually, when users give trust rating values, they consider several factors. For example if i is familiar with j, it is possible that i puts a higher trust rating value on j. This is called familiarity factor. Another example is that some group may have a tendency to give out average higher trust values while other groups may not, which is named as rating tendency factor. Such factors are called “latent factors” and we denote $\ell $ as the number of latent factors. Thus, user i’s rating style can be represented as a vector of size $\ell $, named as trustor-specific vector $\varvec{u_i}$, and these $\varvec{u_i}$’s could form an $n \times \ell $ trustor-specific matrix U. Similar to iSim-VSS, we define latent factor similarity between users as the cosine value of their trust-specific vectors.

We can factorize our trust rating matrix T into trustor-specific matrix U and trustee-specific matrix V [28] according to matrix factorization method. To cooperate with $\varvec{u_i}$, we denote the $j^{th}$ row vector in V as $\varvec{v_j}$, which makes the inner product of $\varvec{u_i}$ and $\varvec{v_j}$ an approximation of trust rating value $T_{i,j}$. In order to get U and V, we need to minimize the sum-of-squared-error objective function with quadratic regularization terms [19]. In this case, we introduce an indicator matrix I to indicate whether there is a value in the corresponding position in T. Additionally, in order to avoid overfitting, two regularization terms from zero-mean spherical Gaussian priors are taken into consideration. Thus, we formulate our factorization function as Eq. (2):

$$\begin{aligned} \mathcal {L} (T,U,V)=\min _{U,V} \frac{1}{2} \sum \limits _{i=1}^n \sum \limits _{j=1}^n I_{i,j}\cdot ||T_{i,j}-\varvec{u_i}\cdot \varvec{v_j'}||_F^2 +\frac{\lambda _1}{2}||U||_F^2+\frac{\lambda _2}{2}||V||_F^2 \end{aligned}$$

(2)

where $||.||_F$ refers to Frobenius norm while $\lambda _1$ and $\lambda _2$ are coefficients controlling the extent of Gaussian priors. $I_{i,j}$ equals to 1 iff. $T_{i,j}$ exists, otherwise it is 0. We use gradient descent method to minimize the value of $\mathcal {L}(T,U,V)$ [19] by Eqs. (3) and (4). Algorithm 2 demonstrates the details to compute iSim-LFS.

$$\begin{aligned} \frac{\partial \mathcal {L}}{\partial U} = -\sum \limits _{i=1}^n \sum \limits _{j=1}^n I_{i,j}\cdot ||T_{i,j}-\varvec{u_i}\cdot \varvec{v_j'}||_F\cdot \varvec{v_j} +\sum \limits _{i=1}^n \lambda _1 \varvec{u_i} \end{aligned}$$

(3)

$$\begin{aligned} \frac{\partial \mathcal {L}}{\partial V} = -\sum \limits _{i=1}^n \sum \limits _{j=1}^n I_{i,j}\cdot ||T_{i,j}-\varvec{u_i}\cdot \varvec{v_j'}||_F\cdot \varvec{u_i} +\sum \limits _{j=1}^n \lambda _2 \varvec{v_j} \end{aligned}$$

(4)

iSim-PT: iSim-VSS and iSim-LFS are inspired by collaborative filtering. To make iSim specifically crafted for trust predicting problem, we consider trust propagation. It has been proved that in a real social network, users tend to trust a stranger if this person is a friend’s friend [6], so propagated trust strategy predicts the unknown trust by calculating the propagated trust [2, 11, 22]. iSim-PT sets the base value of similarity of users based on propagated trust, because there is a correlation between preference similarity and trust [1, 4].

Here, we use the min-max strategy [22] to calculate propagated trust matrix PT, which consists of propagated trust between any two users. Assume there are m paths from user i to user j. In one particular path, we pick up the maximal weight of all links in the path as the propagated trust of the path,and among these m paths, we pick up the path with minimal propagated trust, and that will be the value of $PT_{i,j}$.

However, there is a problem that only a few paths exist between two users in the directed graph. Thus, we transform the directed graph into an undirected one. For two users i and j, we set the weight of the undirected edge as the average of two weights of directed edge between them. It can be proved by the Cut Property [3] that $PT_{i,j}$ has the same value in the original undirected graph as that in the minimal spanning tree of the graph. Therefore, we reform Kruskal algorithm [13] in order to find out the matrix PT. Algorithm 3 gives out the detailed description of our method.

Trust Prediction: Comprehensively considering the effects of these factors, $VSS_{i,j}$ and $LFS_{i,j}$ respectively describes the explicit and implicit similarity between two users, while $PT_{i,j}$ pays attention to their trust relationship between them. Combing the three factors together, we define hybrid similarity for the first time by Eq. (5). Equation (5) multiplies the linear combination of explicit similarity and implicit similarity by similarity baseline derived from propagated trust. To measure similarity baseline, we explore the relationship among similarity, intimacy and trust. There is a linear relationship between similarity and intimacy among people [1], and the more intimate the relationship is, the harder people can strengthen trust by making the relationship more intimate (marginal effect) [4]. Therefore, trust can be treated as a logarithmic function on similarity, which leads to the similarity baseline in Eq. (5).

$$\begin{aligned} Sim_{i,j} = (\alpha _1\cdot VSS_{i,j} + \alpha _2\cdot LFS_{i,j}) \cdot e^{PT_{i,j}-1} \end{aligned}$$

(5)

$\alpha _1$ and $\alpha _2$ are two coefficients used to balance the explicit similarity and implicit similarity, and $\alpha _1 + \alpha _2 = 1$.

So far, with the aforementioned definitions and concepts, we can predict the trust from the trustor to the trustee based on the set $\varvec{P}$ generated by iSim. Let $\varvec{P_i}$ be the set of K-NN of user i. Equation (6) is our formula of trust prediction.

$$\begin{aligned} \widetilde{T}_{i,j} =\left( \sum \limits _{k \in \varvec{P_i}}Sim_{i,j}\cdot T_{k, j}\right) / \sum \limits _{k \in \varvec{P_i}} Sim_{i,j} \end{aligned}$$

(6)

5 Theoretical Analysis of iSim

In this section, we will analyze the time complexity of iSim. Our approach can be divided into two parts: the pre-training process and the predicting process. In the pre-training process, iSim calculates the similarities and finds the top-K similar users. In the predicting process, iSim predicts the trust rating values based on the similarity.

We first introduce three lemmas. Note that n is the number of users, t is the number of non-zero values in T, m is the number of iteration of matrix factorization, $\ell $ is the number of latent factors, K is the numbers of the similar users selected, and $\mathcal {A}^{-1}$ is the inverse of Ackermann’s function [21].

Lemma 1

The time complexity to compute matrix VSS is $\mathcal {O}(n^2+\frac{t^2}{n})$.

Proof

The time complexity is $\mathcal {O}(t)$ in Step 1, and it is $\mathcal {O}(n^2)$ in Step 2. iSim-VSS runs from Step 4 to Step 4 for k times, where . By the assumption that the data is randomly distributed, the time complexity is $\mathcal {O}(\frac{t^2}{n})$. Lastly, the time complexity from Step 8 to Step 8 is $\mathcal {O}(n^2)$. Thus, the total time complexity is $\mathcal {O}(n^2+\frac{t^2}{n})$. $\square $

Lemma 2

The time complexity to compute matrix LFS is $\mathcal {O}(n^2{\ell }+m{\ell }(n\,+\,t))$.

Proof

The time complexity is $\mathcal {O}(n\ell )$ in Step 2, where $\ell $ is the number of latent factors. The algorithm runs from Step 3 to Step 3 with the complexity of $\mathcal {O}(m\ell (n+t))$, where m is the number of iterations. The complexity from Step 6 to Step 6 is $\mathcal {O}(n^2{\ell })$. Thus, the total time complexity is $\mathcal {O}({\ell }n^2+m\ell (n+t))$. $\square $

Lemma 3

The complexity to compute matrix PT is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+n^2)$.

Proof

The complexity is $\mathcal {O}(n^2)$ in Step 1, $\mathcal {O}(n)$ in Step 2, and $\mathcal {O}(t)$ in Step 3. Step 4 enumerates every record in ascending order. It determines the ascending order in $\mathcal {O}(t\log {t})$, and enumerates t times. The operators in Step 5 and Step 10 can be executed by Disjoint Sets [21] in $\mathcal {O}(\mathcal {A}^{-1}(n, t))$, where $\mathcal {A}^{-1}$ is the inverse of Ackermann’s function. It is hard to determine directly how many times the code in Step 9 would execute, but obviously every $PT_{i,j}$ is only calculated once. Thus, the time complexity from Step 4 to Step 4 is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+n^2)$. Thus, the total time complexity is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+t+n^2)$. $\square $

Theorem 1

The time complexity of iSim is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+n^2{\ell }+m{\ell }(n+t)+n^2K)$.

Proof

Based on the the Lemmas 1, 2, and 3, the time complexity of pre-training process of iSim is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+{\ell }n^2+m{\ell }(n+t))$. The time complexity to predict the trust between two users is $\mathcal {O}(K)$. Thus, the time complexity of iSim to predict the trust of all pairs of users is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+{\ell }n^2+m{\ell }(n+t)+n^2K)$. $\square $

6 Experimental Evaluation

In this section, we present the experimental results of iSim, and analyze the effectiveness and efficiency. All the experiments are based on a real-world dataset. All of the experiments are implemented using Matlab R2014b running on an PC with Intel Core i5, 2.6 GHz CPU, 4 GB RAM and Ubuntu 14.04 operating system.

6.1 Experiment Setup

Dataset Description. The dataset used in our experiment is advotago ^{Footnote 1}. advotago is a trust-based social network for open source software development. To allow users to certify others, The social network provides 4 levels of trust ratings between users, which are “Observer”, “Apprentice”, “Journeyer”, and “Master”. The “Observer” is the lowest trust level, and the “Master” is the highest trust level. To quantify the dataset for calculation, we convert “Observer”, “Apprentice”, “Journeyer”, and “Master” to 0.1, 0.4, 0.7, and 1.

We use the statistics that contains 7, 425 users and 56, 550 trust rating records. Each record contains the trust rating values that a user gives to another user. To support repeated experiments, we conduct random sampling for 50 times. In each sampling, $80\,\%$ of records are selected as training set, and $20\,\%$ are selected as testing set. iSim may fail to predict trust values on rare occasions because of data sparsity of advotago. In that case, we use the average trust value of dataset as the default predicting value.

Metric Measures. In our experiments, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are used to measure the predicting effectiveness, which are common metrics in the area of trust predicting [23, 28]. The formulas of them are

$$\begin{aligned} MAE = \tfrac{\sum \limits _{i=1}^n\sum \limits _{j=1}^n{I_{i,j}\vert T_{i,j}-\widetilde{T}_{i,j} \vert }}{\sum \limits _{i=1}^n\sum \limits _{j=1}^n{I_{i,j}}}, RMSE = \sqrt{\tfrac{\sum \limits _{i=1}^n\sum \limits _{j=1}^n {I_{i,j}(T_{i,j}-\widetilde{T}_{i,j})^2}}{\sum \limits _{i=1}^n\sum \limits _{j=1}^n{I_{i,j}}}} \end{aligned}$$

In the formula, $T_{i,j}$ denotes the known trust rating from user i to user j, $\widetilde{T}_{i,j}$ denotes the predicted trust value from user i to user j. MAE is the average value of each absolute error. RMSE is the variance of each absolute error. Compared with MAE, RMSE is sensitive to the large absolute error. The predicting accuracy is higher if MAE and RMSE is lower.

To measure the efficiency of iSim, we consider the execution time of the program. The unit of execution time is the second.

Competitive Methods. In order to evaluate the effectiveness and efficiency of iSim, we select two approaches (SCMF) [28] and (PROP) [7] for comparison. The SCMF is the most efficient trust predicting algorithm based on matrix factorization, which considers the trust rating value similarity and trust rating distribution similarity between users to constrain the process of matrix factorization and result in high predicting accuracy. The PROP predicts trust rating from user i to user j by finding a path with high propagated trust. The propagated trust consists of both mean and variance of trust of the edges on the path and the length of path.

6.2 Experiments for Parameter Settings

The selection of K in the K-NN query is very important. Figure 3(a) shows the influence of K in iSim. In this figure, we set $\alpha _1=\alpha _2=0.5$, and try to set K from 10 to 50 in ascending order with the step of 5. The result shows MAE and RMSE are almost constant, so the accuracy of our approach is not sensitive to K. Thus, we set $K = 20$.

The $\alpha _1$ and $\alpha _2$ determine the weight of two similarities, which influence the predicting accuracy. We try to test the influence of $\alpha _1$ and $\alpha _2$ for iSim, and the results are presented in Fig. 3(b). In this figure, $K = 20$, $\alpha _1$ is set from 0 to 1 with the step of 0.1, and $\alpha _2=1-\alpha _1$, respectively. It shows when $\alpha _1$ increases from 0.2 to 1 or decreases from 0.2 to 0, the accuracy decreases. Thus, we select $\alpha _1=0.2$ and $\alpha _2=0.8$. Obviously, in the dataset, since the dataset is sparse, latent factor similarity is more important. However, it is hard to know the weight in the reality, so we also set $\alpha _1=0.5$ and $\alpha _2=0.5$ in the experiment.

The $\ell $, $\lambda _1$ and $\lambda _2$ are the parameters in the matrix factorization. $\ell $ is the number of latent factors. A large $\ell $ can increase the accuracy of predicting but slow down the speed of prediction. $\lambda _1$ and $\lambda _2$ control the extent of Gaussian priors. We set the same values with SCMF: $\ell =10$, $\lambda _1 = \lambda _2 = 0.01$. The parameter setting of SCMF and PROP are the same as the corresponding paper.

6.3 Evaluation Analysis

Effectiveness Analysis. We compare the predicting accuracy with SCMF and PROP. The results are shown in Fig. 4. In the figure, $\alpha _1$ is set as 0.2. We can see that iSim outperforms other two approaches. Compared with PROP, the MAE decreases by $29.6\,\%$ and the RMSE decreases by $30.5\,\%$. Compared with SCMF, the MAE decreases by $18.8\,\%$ and the RMSE deceases by $11.9\,\%$. These results show that iSim achieves significant improvement, since SCMF is the state-of-art literature with excellent performance.

Specially, we analyze the performance gain from different components. Table 1 shows the results of predicting accuracy of the approaches that adds iSim-VSS, iSim-PT, and iSim-LFS progressively. In the table, “VSS” only uses iSim-VSS, and “VSS+PT” uses iSim-VSS and iSim-PT, (set $\alpha _1$ as 1). “VSS+PT+LFS” denotes iSim with $\alpha _1 = 0.2$. “norm VSS+PT+LFS” denotes iSim with $\alpha _1 = 0.5$. We compare all of them with SCMF, which is the best competitor.

Table 1. Accuracy gain analysis

Full size table

Now we analyze the results in Table 1. Vector space similarity is a common metric of similarity, so we treat it as a basic form. In the result of the basic form, the accuracy is improved by $10.2\,\%$ in terms of MAE and improved by $5.4\,\%$ in terms of RMSE, if compared with the best competitor. If utilizing iSim-VSS and iSim-PT, the accuracy is improved by $13.8\,\%$ in terms of MAE and improved by $7.8\,\%$ in terms of RMSE. If utilizing iSim-VSS, iSim-PT, and iSim-LFS with $\alpha _1=0.5$, the accuracy is improved by $18.1\,\%$ in terms of MAE and improved by $11.3\,\%$ in terms of RMSE. If utilizing iSim-VSS, iSim-PT, and iSim-LFS with $\alpha _1=0.2$, the accuracy is improved by $18.8\,\%$ in terms of MAE and improved by $11.9\,\%$ in terms of RMSE. The results confirm that the three methods of measuring similarity are efficient.

Efficiency Analysis. In fact, some of the trust predicting approaches can be divided into pre-training process and predicting process, e.g., the SCMF. The pre-training process mines the information of datasets which can improve the effectiveness or efficiency of predicting, and then saves it in the small storage space. Thus, we compare the approaches in terms of the two process.

We first give the time complexity of the three approaches. In the pre-training process, the time complexity of iSim is $\mathcal {O}(t\log {t}+t\mathcal {A}^{-1}(n,t)+{\ell }n^2+m{\ell }(n+t))$, while that of SCMF is $\mathcal {O}(mn^2\ell )$, which is higher than that of iSim. The PROP does not have pre-training process. In the predicting process, the time complexity of iSim to predict the trust of one pair of users is $\mathcal {O}(K)$, and that of SCMF is also p$\mathcal {O}(\ell )$. However, that of PROP is $\mathcal {O}((\frac{t}{n})^p)$, which is much higher. Note that p is the maximum length of paths from the two corresponding users. In all, combining the two process, iSim has the best time bound.

Now we consider the execution time of pre-training process. The PROP does not have pre-training process, so we compared the execution time between SCMF with our approach with different components. Figure 5(a) shows the comparisons of execution time. The labels in the figure are same as the previous subsection. Note that the setting of $\alpha _1$ and $\alpha _2$ does not affect the execution time. We can see that the pre-training time of iSim is only approximately $6.7\,\%$ of SCMF. If we do not use iSim-LFS, the execution time ratio of “VSS+PT” to SCMF is only approximately $2.8\,\%$. If we do not consider iSim-PT neither, the ratio is only about $1.8\,\%$. The pre-training executing efficiency is far superior to SCMF, because the constrain component of SCMF makes its time complexity much higher than simple matrix factorization.

Then we consider the execution time of predicting process. Note that the predicting execution time is the same regardless of the component of iSim. Figure 5(b) shows the comparison of execution time. in which log scale is used for the Y axis. We can see the predicting time of iSim is approximately same as that of SCMF, but it is merely about $0.012\,\%$ of that of PROP.

Moreover, we compare the total execution time of three approaches to predict the trust of the whole testing datasets. Figure 5(c) shows the results. We can see that the total execution time of iSim is only about $6.8\,\%$ of SCMF, and only about $3.5\,\%$ of PROP. Specially, if we only use VSS and PT to measure similarity, the total execution time of “VSS+PT” is merely about $2.8\,\%$ of SCMF and about $1.4\,\%$ of PROP. If we exclusively use iSim-VSS to measure similarity, the total execution time is merely $1.8\,\%$ of SCMF and $0.9\,\%$ of PROP. Obviously, iSim is superior to other two approaches in terms of execution time.

7 Conclusion

In this paper, we have proposed a trust predicting approach named iSim: an integrated similarity based collaborative filtering approach for trust prediction in service-oriented social networks. iSim measures the similarity from three respects: vector space similarity, latent factor similarity and propagated trust. It novelly integrates three strategies and makes use of their complementarity. We provide both theoretical analysis for time bound and numerical experiments with real-world dataset to validate the priority of iSim. Moveover, we innovatively leverage latent factor vector in the study of similarity, which might be an inspiration in other fields.

Notes

1.
http://www.trustlet.org/datasets/advogato/.

References

Brehm, S.S.: Intimate Relationships. Mcgraw-Hill Book Company, New York (1992)
Google Scholar
Chen, K., Liu, G., Shen, H., Qi, F.: Sociallink: utilizing social network and transaction links for effective trust management in P2P file sharing systems. In: IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 1–10 (2015)
Google Scholar
Dasgupta, S., Papadimitriou, C., Vazirani, U.: Algorithms. McGraw-Hill, New York (2006)
Google Scholar
Dimitrova-Grajzl, V., Grajzl, P., Guse, A.J.: Trust, perceptions of corruption, and demand for regulation: evidence from post-socialist countries. J. Socio-Econ. 41(3), 292–303 (2012)
Article Google Scholar
Ghodousi, E., Hamzeh, A.: A new approach for trust prediction by using collaborative filtering based of pareto dominance in social networks. Ciencia Natura 37, 95–101 (2015)
Google Scholar
Guha, R., Kumar, R., Raghavan, P., Tomkins, A.: Propagation of trust and distrust. In: International Conference on World Wide Web (WWW), pp. 403–412 (2004)
Google Scholar
Hamdi, S., Bouzeghoub, A., Gancarski, A.L., Ben Yahia, S.: Trust inference computation for online social networks. In: IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 210–217 (2013)
Google Scholar
Hang, C.W., Wang, Y., Singh, M.P.: Operators for propagating trust and theirevaluation in social networks. In: International Conference on Autonomous Agents and Multiagent Systems (AAMAS), vol. 2, pp. 1025–1032 (2009)
Google Scholar
Huang, J., Nie, F., Huang, H., Lei, Y., Ding, C.H.: Social trust prediction using rank-k matrix recovery. In: International Joint Conference on Artificial Intelligence (IJCAI), pp. 2647–2653 (2013)
Google Scholar
Huang, J., Nie, F., Huang, H., Tu, Y.C.: Trust prediction via aggregating heterogeneous social networks. In: ACM International Conference on Information and Knowledge Management (CIKM), pp. 1774–1778 (2012)
Google Scholar
Jiang, W., Wang, G., Wu, J.: Generating trusted graphs for trust evaluation in online social networks. Future Gener. Comput. Syst. (FGCS) 31, 48–58 (2014)
Article Google Scholar
Jiang, W., Wu, J., Li, F., Wang, G., Zheng, H.: Trust evaluation in online social networks using generalized network flow. IEEE Trans. Comput. (TC) 65, 952–963 (2016)
Article MathSciNet Google Scholar
Kruskal, J.B.: On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Am. Math. Soc. 7(1), 48–50 (1956)
Article MathSciNet MATH Google Scholar
Lee, D., Park, J., Shim, J., Lee, S.: An efficient similarity join algorithm with cosine similarity predicate. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds.) DEXA 2010, Part II. LNCS, vol. 6262, pp. 422–436. Springer, Heidelberg (2010)
Chapter Google Scholar
Li, L., Wang, Y., Lim, E.-P.: Trust-oriented composite service selection and discovery. In: Barros, A., Grigori, D., Narendra, N.C., Dam, H.K. (eds.) ICSOC 2015. LNCS, vol. 9435, pp. 50–67. Springer, Heidelberg (2009). doi:10.1007/978-3-642-10383-4_4
Chapter Google Scholar
Liu, G., Liu, A., Wang, Y., Li, L.: An efficient multiple trust paths finding algorithm for trustworthy service provider selection in real-time online social network environments. In: IEEE International Conference on Web Services (ICWS), pp. 121–128 (2014)
Google Scholar
Liu, G., Wang, Y., Orgun, M.A., Lim, E.P.: Finding the optimal social trust path for the selection of trustworthy service providers in complex social networks. IEEE Trans. Serv. Comput. (TSC) 6(2), 152–167 (2013)
Article Google Scholar
Ma, Y., Lu, H., Gan, Z., Ma, X.: Trust discounting and trust fusion in online social networks. In: Chen, L., Jia, Y., Sellis, T., Liu, G. (eds.) APWeb 2014. LNCS, vol. 8709, pp. 619–626. Springer, Heidelberg (2014)
Google Scholar
Mnih, A., Salakhutdinov, R.R.: Probabilistic matrix factorization. In: Conference on Neural Information Processing Systems (NIPS), pp. 1257–1264 (2008)
Google Scholar
Orman, L.V.: Bayesian inference in trust networks. ACM Trans. Manag. Inf. Syst. (TMIS) 4, 48–68 (2013)
Google Scholar
Tarjan, R.E., Van Leeuwen, J.: Worst-case analysis of set union algorithms. J. ACM (JACM) 31(2), 245–281 (1984)
Article MathSciNet MATH Google Scholar
Xu, Y., Liu, J., Tang, M., Liu, X.: An efficient trust propagation scheme for predicting trustworthiness of service providers in service-oriented social networks. In: IEEE International Conference on Web Services (ICWS), pp. 467–474 (2013)
Google Scholar
Yao, Y., Tong, H., Yan, X., Xu, F., Lu, J.: MATRI: a multi-aspect and transitive trust inference model. In: International Conference on World Wide Web (WWW), pp. 1467–1476 (2013)
Google Scholar
Zhang, Y., Chen, H., Wu, Z.: A social network-based trust model for the semantic web. In: Calero, J.M.A., Yang, L.T., Mármol, F.G., García Villalba, L.J., Li, A.X., Wang, Y. (eds.) ATC 2011. LNCS, vol. 6906, pp. 183–192. Springer, Heidelberg (2006). doi:10.1007/11839569_18
Chapter Google Scholar
Zhao, Q., Zuo, W., Tian, Z., Wang, X., Wang, Y.: Predicting trust relationships in social networks based on WKNN. J. Softw. 10(1), 71–81 (2015)
Article Google Scholar
Zheng, X., Wang, Y., Orgun, M.A.: BiNet: trust sub-network extraction using binary ant colony algorithm in contextual social networks. In: International Conference on Web Services (ICWS), pp. 321–328 (2015)
Google Scholar
Zheng, X., Wang, Y., Orgun, M.A., Liu, G., Zhang, H.: Social context-aware trust prediction in social networks. In: Barros, A., Grigori, D., Narendra, N.C., Dam, H.K. (eds.) ICSOC 2015. LNCS, vol. 9435, pp. 527–534. Springer, Heidelberg (2014). doi:10.1007/978-3-662-45391-9_45
Chapter Google Scholar
Zheng, X., Wang, Y., Orgun, M.A., Zhong, Y., Liu, G.: Trust prediction with propagation and similarity regularization. In: AAAI Conference on Artificial Intelligence, pp. 237–243 (2014)
Google Scholar

Download references

Author information

Authors and Affiliations

Shanghai Key Laboratory of Data Science, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Mingding Liao, Xiao Liu, Xiaofeng Gao & Guihai Chen
Department of Computer Science, California State University East Bay, Hayward, USA
Jiaofei Zhong

Authors

Mingding Liao
View author publications
You can also search for this author in PubMed Google Scholar
Xiao Liu
View author publications
You can also search for this author in PubMed Google Scholar
Xiaofeng Gao
View author publications
You can also search for this author in PubMed Google Scholar
Jiaofei Zhong
View author publications
You can also search for this author in PubMed Google Scholar
Guihai Chen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiaofeng Gao .

Editor information

Editors and Affiliations

The University of Adelaide, Adelaide, South Australia, Australia
Quan Z. Sheng
Department of Computer Science, University of Alberta, Edmonton, Alberta, Canada
Eleni Stroulia
Telecom SudParis, University of Paris-Saclay, Evry cedex, France
Samir Tata
ISIMM, University of Monastir, Monastir , Tunisia
Sami Bhiri

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Liao, M., Liu, X., Gao, X., Zhong, J., Chen, G. (2016). iSim: An Efficient Integrated Similarity Based Collaborative Filtering Approach for Trust Prediction in Service-Oriented Social Networks. In: Sheng, Q., Stroulia, E., Tata, S., Bhiri, S. (eds) Service-Oriented Computing. ICSOC 2016. Lecture Notes in Computer Science(), vol 9936. Springer, Cham. https://doi.org/10.1007/978-3-319-46295-0_31

Download citation

DOI: https://doi.org/10.1007/978-3-319-46295-0_31
Published: 20 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46294-3
Online ISBN: 978-3-319-46295-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

iSim: An Efficient Integrated Similarity Based Collaborative Filtering Approach for Trust Prediction in Service-Oriented Social Networks

Abstract

Similar content being viewed by others

Stochastic trust network enriched by similarity relations to enhance trust-aware recommendations

Generating Recommendations via Trust-Aware Recommendation System by the Topological Impact of Users in Social Trust Networks

Recommendation algorithm of influence and trust relationship

Keywords

1 Introduction

2 Related Works