An efficient and effective hop-based approach for influence maximization in social networks

Tang, Jing; Tang, Xueyan; Yuan, Junsong

doi:10.1007/s13278-018-0489-y

Original Article
Published: 10 February 2018

Volume 8, article number 10, (2018)

Social Network Analysis and Mining

Abstract

Influence maximization in social networks is a classic and extensively studied problem that aims to select a set of initial seed nodes to spread influence as widely as possible. However, designing fast and accurate algorithms that find solutions in large-scale social networks remains an open challenge. Prior Monte Carlo simulation-based methods are slow and do not scale, while other heuristic algorithms lack theoretical guarantees and have been shown to produce poor solutions in many cases. In this paper, we propose hop-based algorithms that can be easily applied to billion-scale networks under the commonly used Independent Cascade and Linear Threshold influence diffusion models. Moreover, we provide provable data-dependent approximation guarantees for our proposed hop-based approaches. Experimental evaluations with real social network datasets demonstrate the efficiency and effectiveness of our algorithms.

References

Arora A, Galhotra S, Ranu S (2017) Debunking the myths of influence maximization: an in-depth benchmarking study. In: Proceedings of ACM SIGMOD, pp 651–666
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Borgs C, Brautbar M, Chayes J, Lucier B (2014) Maximizing social influence in nearly optimal time. In: Proceedings of SODA, pp 946–957
Cha M, Mislove A, Gummadi KP (2009) A measurement-driven analysis of information propagation in the Flickr social network. In: Proceedings WWW, pp 721–730
Chen W (2009) NetHEPT dataset. http://research.microsoft.com/en-us/people/weic/
Cheng S, Shen H, Huang J, Chen W, Cheng X (2014) IMRank: influence maximization via finding self-consistent ranking. In: Proceedings ACM SIGIR, pp 475–484
Cheng S, Shen H, Huang J, Zhang G, Cheng X (2013) Staticgreedy: solving the scalability-accuracy dilemma in influence maximization. In: Proceedings ACM CIKM, pp 509–518
Chen W, Lu W, Zhang N (2012) Time-critical influence maximization in social networks with time-delayed diffusion process. In: Proceedings of AAAI, pp 592–598
Chen W, Wang C, Wang Y (2010a) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of ACM KDD, pp 1029–1038
Chen W, Wang Y, Yang S (2009) Efficient influence maximization in social networks. In: Proceedings of ACM KDD, pp 199–208
Chen W, Yuan Y, Zhang L (2010b) Scalable influence maximization in social networks under the linear threshold model. In: Proceedings of IEEE ICDM, pp 88–97
Cohen E, Delling D, Pajor T, Werneck RF (2014) Sketch-based influence maximization and computation: scaling up with guarantees. In: Proceedings ACM CIKM, pp 629–638
Conforti M, Cornuéjols G (1984) Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the Rado-Edmonds theorem. Discrete Appl Math 7(3):251–274
Dinh TN, Zhang H, Nguyen DT, Thai MT (2014) Cost-effective viral marketing for time-critical campaigns in large-scale social networks. IEEE ACM Trans Netw 22(6):2001–2011
Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings ACM KDD, pp 57–66
Galhotra S, Arora A, Roy S (2016) Holistic influence maximization: Combining scalability and efficiency with opinion-aware models. In: Proceedings ACM SIGMOD, pp 743–758
Goel S, Watts DJ, Goldstein DG (2012) The structure of online diffusion networks. In: Proceedings ACM EC, pp 623–638
Goyal A, Bonchi F, Lakshmanan LVS (2011a) A data-based approach to social influence maximization. Proc VLDB Endow 5(1):73–84
Goyal A, Bonchi F, Lakshmanan L, Venkatasubramanian S (2013) On minimizing budget and time in influence propagation over social networks. Social Netw Anal Min 3(2):179–192
Goyal A, Lu W, Lakshmanan LV (2011b) Celf++: Optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings WWW Companion, pp 47–48
Goyal A, Lu W, Lakshmanan LVS (2011c) Simpath: An efficient algorithm for influence maximization under the linear threshold model. In: Proceedings IEEE ICDM, pp 211–220
Jiang F, Jin S, Wu Y, Xu J (2014) A uniform framework for community detection via influence maximization in social networks. In: Proceedings IEEE/ACM ASONAM, pp 27–32
Jung K, Heo W, Chen W (2012) IRIE: scalable and robust influence maximization in social networks. In: Proceedings IEEE ICDM, pp 918–923
Kempe D, Kleinberg J, Tardos E (2003) Maximizing the spread of influence through a social network. In: Proceedings ACM KDD, pp 137–146
Kwak H, Lee C, Park H, Moon S (2010) What is Twitter, a social network or a news media? In: Proceedings of WWW, pp 591–600
Lee JR, Chung CW (2014) A fast approximation for influence maximization in large social networks. In: WWW Companion, pp 1157–1162
Leskovec J, Adamic LA, Huberman BA (2007a) The dynamics of viral marketing. ACM Trans Web 1(1):5:1–5:39
Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance N (2007b) Cost-effective outbreak detection in networks. In: Proceedings of ACM KDD, pp 420–429
Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data
Li Y, Zhao BQ, Lui JCS (2012) On modeling product advertisement in large-scale online social networks. IEEE ACM Trans Netw 20(5):1412–1425
Lin Y, Chen W, Lui JC (2017) Boosting information spread: an algorithmic approach. In: Proceedings of IEEE ICDE, pp 883–894
Liu B, Cong G, Xu D, Zeng Y (2012) Time constrained influence maximization in social networks. In: Proceedings of IEEE ICDM, pp 439–448
Lu W, Chen W, Lakshmanan LV (2015) From competition to complementarity: comparative influence diffusion and maximization. Proc VLDB Endow 9(2):60–71
Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions-I. Math Program 14(1):265–294
Nguyen HT, Dinh TN, Thai MT (2016a) Cost-aware targeted viral marketing in billion-scale networks. In: Proceedings of IEEE INFOCOM
Nguyen HT, Thai MT, Dinh TN (2016b) Stop-and-stare: optimal sampling algorithms for viral marketing in billion-scale networks. In: Proceedings of ACM SIGMOD, pp 695–710
Ohsaka N, Akiba T, Yoshida Y, Kawarabayashi K (2014) Fast and accurate influence maximization on large networks with pruned Monte-Carlo simulations. In: Proceedings of AAAI, pp 138–144
Ohsaka N, Sonobe T, Fujita S, Kawarabayashi Ki (2017) Coarsening massive influence networks for scalable diffusion analysis. In: Proceedings of ACM SIGMOD, pp 635–650
Song G, Zhou X, Wang Y, Xie K (2015) Influence maximization on large-scale mobile social network: a divide-and-conquer method. IEEE Trans Parallel Distrib Syst 26(5):1379–1392
Tang Y, Shi Y, Xiao X (2015) Influence maximization in near-linear time: A martingale approach. In: Proceedings of ACM SIGMOD, pp 1539–1554
Tang J, Tang X, Xiao X, Yuan J (2018a) Online processing algorithms for influence maximization. In: Proceedings of ACM SIGMOD
Tang J, Tang X, Yuan J (2016) Profit maximization for viral marketing in online social networks. In: Proceedings of IEEE ICNP, pp 1–10
Tang J, Tang X, Yuan J (2017a) Influence maximization meets efficiency and effectiveness: a hop-based approach. In: Proceedings of IEEE/ACM ASONAM, pp 64–71
Tang J, Tang X, Yuan J (2017b) Profit maximization for viral marketing in online social networks: algorithms and analysis. IEEE Trans Knowl Data Eng (Preprint)
Tang J, Tang X, Yuan J (2018b) Towards profit maximization for online social network providers. In: Proceedings of IEEE INFOCOM
Tang Y, Xiao X, Shi Y (2014) Influence maximization: Near-optimal time complexity meets practical efficiency. In: Proceedings of ACM SIGMOD, pp 75–86
Wang Z, Yang Y, Pei J, Chu L, Chen E (2017) Activity maximization by effective information diffusion in social networks. IEEE Trans Knowl Data Eng 29(11):2374–2387
Xu W, Lu Z, Wu W, Chen Z (2014) A novel approach to online social influence maximization. Social Netw Anal Min 4(1):153
Zhang C, Sun J, Wang K (2013) Information propagation in microblog networks. In: Proceedings of IEEE/ACM ASONAM, pp 190–196
Zhou C, Zhang P, Guo J, Guo L (2014) An upper bound based greedy algorithm for mining top-k influential nodes in social networks. In: Proceedings of WWW Companion, pp 421–422
Zhou C, Zhang P, Guo J, Zhu X, Guo L (2013) UBLF: an upper bound based approach to discover influential nodes in social networks. In: Proceedings of IEEE ICDM, pp 907–916

Acknowledgements

This research is supported by the National Research Foundation, Prime Minister’s Office, Singapore, under its IDM Futures Funding Initiative, and by Singapore Ministry of Education Academic Research Fund Tier 1 under Grant 2017-T1-002-024 and Tier 2 under Grant MOE2015-T2-2-114.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore
Jing Tang & Xueyan Tang
Computer Science and Engineering Department, State University of New York at Buffalo, Buffalo, 602000, New York, USA
Junsong Yuan

Corresponding author

Correspondence to Jing Tang.

Appendix

Proof of Theorem 1

To consider the outgoing edges from u one at a time, we first disable all the edges from u to its neighbors except for one edge $\langle u,w_1\rangle$. Then, for each neighbor v of $w_1$, all of v’s inverse neighbors other than $w_1$ have their one-hop activation probabilities unchanged by adding $\langle u,w_1\rangle$. Let $\pi _2^{{S}\cup \{u\}}(v|w_1)$ denote the new two-hop activation probability of v. Then, we have

$$\begin{aligned} \frac{1-\pi _2^{{S}\cup \{u\}}(v|w_1)}{1-\pi _2^{{S}}(v)}=\rho (S,u,v,w_1), \end{aligned}$$

(16)

where $\rho (S,u,v,w)=\frac{1-p_{w,v}\cdot \pi _1^{{S}\cup \{u\}}(w)}{1-p_{w,v}\cdot \pi _1^{{S}}(w)}$. Next, we enable the second edge $\langle u,w_2\rangle$. Let $\pi _2^{{S}\cup \{u\}}(v|w_1,w_2)$ denote the new two-hop activation probability of v. Following similar arguments, for each neighbor v of $w_2$, we have

$$\begin{aligned} \frac{1-\pi _2^{{S}\cup \{u\}}(v|w_1,w_2)}{1-\pi _2^{{S}\cup \{u\}}(v|w_1)} =\rho (S,u,v,w_2). \end{aligned}$$

(17)

We continue to enable the outgoing edges of u sequentially. In general, when an edge $\langle u,w_i\rangle$ is enabled after edges $\langle u,w_1\rangle, \langle u,w_2\rangle, \ldots , \langle u,w_{i-1}\rangle$, for each neighbor v of $w_i$, we have

$$\begin{aligned} \frac{1-\pi _2^{{S}\cup \{u\}}(v|w_1,\dots ,w_i)}{1-\pi _2^{{S}\cup \{u\}}(v|w_1,\dots ,w_{i-1})}=\rho (S,u,v,w_i). \end{aligned}$$

(18)

Therefore, we can initialize $\pi _2^{{S}\cup \{u\}}(v)$ with $\pi _2^{{S}}(v)$ and iteratively update $\pi _2^{{S}\cup \{u\}}(v)$ with

$$\begin{aligned} 1-\left(1-\pi _2^{{S}\cup \{u\}}(v)\right)\cdot \rho (S,u,v,w), \end{aligned}$$

(19)

for all the nodes $w\in {N}_u\setminus {S}$ and $v\in {N}_w\setminus {S}$. Moreover, for the direct neighbors of u, their two-hop activation probabilities also need to be adjusted because u’s one-hop activation probability has changed from $\pi _1^{{S}}(u)$ to 1. For each neighbor v of u, the adjustment can be made in a similar way by updating $\pi _2^{{S}\cup \{u\}}(v)$ with

$$\begin{aligned} 1-\left(1-\pi _2^{{S}\cup \{u\}}(v)\right)\cdot \rho (S,u,v,u). \end{aligned}$$

(20)

Then, the final two-hop activation probability $\pi _2^{{S}\cup \{u\}}(v)$ obtained from the iterative updates (19) and (20) is

$$\begin{aligned} \pi _2^{{S}\cup \{u\}}(v) = 1-\left(1-\pi _2^{{S}}(v)\right)\cdot \prod _{w\in ({M}_{u,v}\cup \{u\})}\rho (S,u,v,w). \end{aligned}$$

(21)

Hence, the theorem is proven. $\square$
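The update in (21) can be checked numerically on a toy graph. The following is a minimal sketch, not the authors' implementation. It assumes the IC forms $\pi _1^{S}(v)=1-\prod _{s\in {S}\cap {I}_v}(1-p_{s,v})$ and $1-\pi _2^{S}(v)=\prod _{w\in {I}_v}(1-p_{w,v}\cdot \pi _1^{S}(w))$ for non-seed v, which are inferred from the shape of $\rho$ rather than stated in this appendix, and it takes ${M}_{u,v}$ to be the non-seed neighbors of u that have an edge to v. The graph, the probabilities, and all function names are hypothetical.

```python
from math import prod

def pi1(seeds, v, in_nbrs, p):
    """One-hop activation probability of v (assumed IC form, see lead-in)."""
    if v in seeds:
        return 1.0
    return 1.0 - prod(1.0 - p[(s, v)] for s in in_nbrs.get(v, ()) if s in seeds)

def pi2(seeds, v, in_nbrs, p):
    """Two-hop activation probability of v (assumed product form, see lead-in)."""
    if v in seeds:
        return 1.0
    return 1.0 - prod(
        1.0 - p[(w, v)] * pi1(seeds, w, in_nbrs, p) for w in in_nbrs.get(v, ())
    )

def rho(seeds, u, v, w, in_nbrs, p):
    """rho(S, u, v, w) as defined after Eq. (16); a missing edge counts as p = 0."""
    p_wv = p.get((w, v), 0.0)
    new = 1.0 - p_wv * pi1(seeds | {u}, w, in_nbrs, p)
    old = 1.0 - p_wv * pi1(seeds, w, in_nbrs, p)
    return new / old

def pi2_after_adding(seeds, u, v, in_nbrs, out_nbrs, p):
    """Eq. (21): two-hop probability of v once u joins the seed set."""
    # Assumption: M_{u,v} = non-seed out-neighbors of u that have an edge to v.
    m_uv = {w for w in out_nbrs.get(u, ()) if w not in seeds and (w, v) in p}
    factor = prod(rho(seeds, u, v, w, in_nbrs, p) for w in m_uv | {u})
    return 1.0 - (1.0 - pi2(seeds, v, in_nbrs, p)) * factor

# Hypothetical toy graph: edge (x, y) -> propagation probability p_{x,y}.
p = {
    ("s", "a"): 0.5, ("u", "a"): 0.4, ("u", "b"): 0.3, ("a", "v"): 0.6,
    ("b", "v"): 0.2, ("u", "v"): 0.1, ("s", "v"): 0.7,
}
in_nbrs, out_nbrs = {}, {}
for x, y in p:
    out_nbrs.setdefault(x, set()).add(y)
    in_nbrs.setdefault(y, set()).add(x)

seeds = frozenset({"s"})
incremental = pi2_after_adding(seeds, "u", "v", in_nbrs, out_nbrs, p)
from_scratch = pi2(seeds | {"u"}, "v", in_nbrs, p)
print(incremental, from_scratch)
```

Under these assumptions the incremental value and the value recomputed from scratch agree, while the incremental path only touches u and the neighbors of u that point to v.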

Proof of Theorem 2

Consider a single seed $\{u\}$. Let ${A}_u\subseteq {N}_u$ denote a subset of a node u’s neighbors. Let $p({A}_u)$ denote the probability that all the nodes in ${A}_u$ are activated directly by u under the IC and LT models, while all the nodes in ${N}_u\setminus {A}_u$ are not directly activated by u (they may not even be activated eventually). Since each of u’s neighbors is activated by u independently, we have

$$\begin{aligned} p({A}_u)=\left(\prod _{v\in {A}_u} p_{u,v}\right)\cdot \left(\prod _{v\in {N}_u\setminus {A}_u}(1-p_{u,v})\right). \end{aligned}$$

(22)

Furthermore, with h hops of propagation, for each node $w\in {V}\setminus \{u\}$, w can only be activated by a propagation path starting from a node $v\in {A}_u$ whose path length is no longer than $h-1$ hops. In other words, the probability for w to be activated by ${A}_u$ is $\pi _{h-1}^{{A}_u}(w)$. Considering all the possible node sets ${A}_u$ activated directly by u, we have

$$\begin{aligned}&\sigma _h(\{u\})\nonumber \\&\quad = 1+\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sum _{w\in {V}\setminus \{u\}}\pi _{h-1}^{{A}_u}(w)\right)\nonumber \\&\quad \le 1+\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sum _{w\in {V}}\pi _{h-1}^{{A}_u}(w)\right)\nonumber \\&\quad = 1+\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sigma _{h-1}({A}_u)\right)\nonumber \\&\quad \le 1+\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sum _{v\in {A}_u}\sigma _{h-1}(\{v\})\right)\nonumber \\&\quad = 1+\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sum _{v\in {N}_u}\left(\sigma _{h-1}(\{v\})\cdot p(v\in {A}_u)\right)\right)\nonumber \\&\quad = 1+\sum _{{A}_u\subseteq {N}_u}\left(\sum _{v\in {N}_u}\left(p({A}_u)\cdot \sigma _{h-1}(\{v\})\cdot p(v\in {A}_u)\right)\right)\nonumber \\&\quad = 1+\sum _{v\in {N}_u}\left(\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot \sigma _{h-1}(\{v\})\cdot p(v\in {A}_u)\right)\right)\nonumber \\&\quad = 1+\sum _{v\in {N}_u}\left(\sigma _{h-1}(\{v\})\cdot \sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot p(v\in {A}_u)\right)\right). \end{aligned}$$

(23)

The second “$\le$” is due to the submodularity of $\sigma _{h-1}(\cdot )$ (see Theorem 3), which, together with $\sigma _{h-1}(\emptyset )=0$, gives $\sigma _{h-1}({A}_u)\le \sum _{v\in {A}_u}\sigma _{h-1}(\{v\})$. In the third “=”, $p(v\in {A}_u)$ is an indicator that equals 1 if $v\in {A}_u$ and 0 otherwise. Meanwhile, we have

$$\begin{aligned}&\sum _{{A}_u\subseteq {N}_u}\left(p({A}_u)\cdot p(v\in {A}_u)\right)\nonumber \\&\quad =\sum _{{A}_u\subseteq {N}_u\setminus \{v\}}\left(p({A}_u)\cdot p(v\in {A}_u)\right)\nonumber \\&\qquad +\sum _{{A}_u\subseteq {N}_u\setminus \{v\}}\left(p({A}_u\cup \{v\})\cdot p(v\in {A}_u\cup \{v\})\right)\nonumber \\&\quad =\sum _{{A}_u\subseteq {N}_u\setminus \{v\}}p({A}_u\cup \{v\}). \end{aligned}$$

(24)

The last “=” follows from the fact that $p(v\in {A}_u)=0$ since $v\not \in {A}_u\subseteq {N}_u\setminus \{v\}$ and $p(v\in {A}_u\cup \{v\})=1$ since $v\in {A}_u\cup \{v\}$. Therefore, from (23) and (24), we have

$$\begin{aligned} \sigma _h(\{u\})\le 1+\sum _{v\in {N}_u}\left(\sigma _{h-1}(\{v\})\cdot \sum _{{A}_u\subseteq {N}_u\setminus \{v\}}p({A}_u\cup \{v\})\right). \end{aligned}$$

(25)

Furthermore, by definition,

$$\begin{aligned}&\sum _{{A}_u\subseteq {N}_u\setminus \{v\}}p({A}_u\cup \{v\})\nonumber \\&\quad = \sum _{{A}_u\subseteq {N}_u\setminus \{v\}}\left(\left(\prod _{w\in {A}_u\cup \{v\}} p_{u,w}\right)\cdot \left(\prod _{w\in {N}_u\setminus ({A}_u\cup \{v\})}(1-p_{u,w})\right)\right)\nonumber \\&\quad = \sum _{{A}_u\subseteq {N}_u\setminus \{v\}}\left(p_{u,v}\cdot \left(\prod _{w\in {A}_u} p_{u,w}\right)\cdot \left(\prod _{w\in {N}_u\setminus ({A}_u\cup \{v\})}(1-p_{u,w})\right)\right)\nonumber \\&\quad = p_{u,v}\cdot \sum _{{A}_u\subseteq {N}_u\setminus \{v\}}\left(\left(\prod _{w\in {A}_u} p_{u,w}\right)\cdot \left(\prod _{w\in {N}_u\setminus ({A}_u\cup \{v\})}(1-p_{u,w})\right)\right)\nonumber \\&\quad = p_{u,v}\cdot 1\nonumber \\&\quad = p_{u,v}. \end{aligned}$$

(26)
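The identity in (26) says that summing $p({A}_u\cup \{v\})$ over all subsets of the remaining neighbors recovers the marginal probability $p_{u,v}$, because the inner sum runs over a complete probability distribution. A small enumeration, given here as an illustrative sketch with hypothetical neighbor names and probabilities, makes this concrete:

```python
from itertools import combinations
from math import prod

def p_exact(active, nbrs, p_u):
    """Eq. (22): probability that exactly the set `active` is activated directly by u."""
    return prod(p_u[w] for w in active) * prod(1.0 - p_u[w] for w in nbrs - active)

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

# Hypothetical probabilities p_{u,w} for three neighbors of u.
p_u = {"a": 0.2, "b": 0.5, "c": 0.9}
nbrs = frozenset(p_u)

# Left-hand side of Eq. (26) for each neighbor v.
marginals = {
    v: sum(p_exact(a | {v}, nbrs, p_u) for a in subsets(nbrs - {v}))
    for v in nbrs
}
total_mass = sum(p_exact(a, nbrs, p_u) for a in subsets(nbrs))
print(marginals, total_mass)
```

Each entry of `marginals` matches the corresponding `p_u` value up to floating-point error, and `total_mass` confirms that (22) defines a probability distribution over subsets.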

Thus, by (25) and (26), it holds that

$$\begin{aligned} \sigma _h(\{u\})\le 1+\sum _{v\in {N}_u}\left(\sigma _{h-1}(\{v\})\cdot p_{u,v}\right). \end{aligned}$$

(27)

Inequality (11) can be proved by induction. When $h=1$, the inequality follows directly from Inequality (10). Suppose that it holds for $h-1$ hops of propagation, i.e., $\sigma _{h-1}(\{u\}) \le \hat{\sigma }_{h-1}(\{u\})$. Then, for h hops of propagation, we have

$$\begin{aligned} \sigma _h(\{u\})&\le 1+\sum _{v\in {N}_u}\left(p_{u,v}\cdot \sigma _{h-1}(\{v\})\right)\nonumber \\&\le 1+\sum _{v\in {N}_u}\left(p_{u,v}\cdot \hat{\sigma }_{h-1}(\{v\})\right)\nonumber \\&= \hat{\sigma }_{h}(\{u\}). \end{aligned}$$

(28)

Therefore, for any $h\ge 0$, we have $\sigma _{h}(\{u\}) \le \hat{\sigma }_{h}(\{u\})$. $\square$
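The recursion used in (28), $\hat{\sigma }_{h}(\{u\})=1+\sum _{v\in {N}_u}p_{u,v}\cdot \hat{\sigma }_{h-1}(\{v\})$, can be evaluated for all nodes with h sweeps over the edges. The sketch below is illustrative only. The base case $\hat{\sigma }_{0}(\{u\})=1$ (only the seed itself counts with no propagation) is an assumption, since the appendix does not restate it, and the graph and function name are hypothetical.

```python
def hop_upper_bound(out_edges, h):
    """Return {u: sigma_hat_h({u})} via the recursion in Eq. (28)."""
    nodes = set(out_edges) | {v for nbrs in out_edges.values() for v in nbrs}
    # Assumed base case: with zero hops only the seed itself is active.
    bound = {u: 1.0 for u in nodes}
    for _ in range(h):
        # One sweep raises the hop limit of every node's bound by one.
        bound = {
            u: 1.0 + sum(p_uv * bound[v] for v, p_uv in out_edges.get(u, {}).items())
            for u in nodes
        }
    return bound

# Hypothetical diamond graph: u -> a, u -> b, a -> c, b -> c, all with p = 0.5.
out_edges = {"u": {"a": 0.5, "b": 0.5}, "a": {"c": 0.5}, "b": {"c": 0.5}}
bounds = hop_upper_bound(out_edges, 2)
print(bounds["u"])
```

On this graph the bound for u is 2.5, while the exact two-hop IC spread is $1+0.5+0.5+(1-0.75^2)=2.4375$. The gap comes from counting node c once per path instead of once overall, which is the slack that the subadditivity step in (23) allows.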

Proof of Theorem 3

This can be proved using the live edge approach (Kempe et al. 2003).

Under the IC model, for each edge $\langle u,v\rangle\in {E}$, we independently flip a coin of bias $p_{u,v}$ to decide whether the edge $\langle u,v\rangle$ is live or blocked. The resulting set of live edges forms a sample influence propagation outcome X.
Under the LT model, each node $v\in V$ picks at most one of its incoming edges at random, selecting the edge from an inverse neighbor u with probability $p_{u,v}$ and selecting no incoming edge with probability $1-\sum _{u\in {I}_v}p_{u,v}$.

We use p(X) to denote the probability of a specific outcome X in the sample space. Let ${V}_h^X(v)$ denote the node set that can be reached from a node v within h hops in the sample outcome X. Then, the number of nodes that can be reached from a seed set S within h hops in the outcome X is given by $\sigma _h^X({S})=\Big |\bigcup _{v\in {S}}{V}_h^X(v)\Big |$. Thus,

$$\begin{aligned} \sigma _h({S})=\sum _{X}\left(p(X)\cdot \sigma _h^X({S})\right), \end{aligned}$$

(29)

where the monotonicity of $\sigma _h({S})$ holds since $\sigma _h^X({S})$ never decreases as S expands.

The marginal influence gain

$$\begin{aligned} \sigma _h^X({S}\cup \{u\})-\sigma _h^X({S})=\Big |{V}_h^X(u)\setminus \bigcup _{v\in {S}}{V}_h^X(v)\Big | \end{aligned}$$

(30)

is the number of nodes that are reachable from a node u within h hops but are not reachable from any node in a seed set S within h hops in a sample outcome X. For any two node sets S and T where ${S}\subseteq {T}$, we have $\bigcup _{v\in {S}}{V}_h^X(v)\subseteq \bigcup _{v\in {T}}{V}_h^X(v)$. Thus, ${V}_h^X(u)\setminus \bigcup _{v\in {S}}{V}_h^X(v)\supseteq {V}_h^X(u)\setminus \bigcup _{v\in {T}}{V}_h^X(v)$, which implies that

$$\begin{aligned} \sigma _h^X({S}\cup \{u\})-\sigma _h^X({S})\ge \sigma _h^X({T}\cup \{u\})-\sigma _h^X({T}). \end{aligned}$$

(31)

Since $p(X)\ge 0$ for any X, taking the linear combination, we have

$$\begin{aligned} \sigma _h({S}\cup \{u\})-\sigma _h({S})\ge \sigma _h({T}\cup \{u\})-\sigma _h({T}). \end{aligned}$$

(32)

Thus, $\sigma _h(\cdot )$ is submodular. $\square$
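For a very small graph, the live-edge argument can be replayed exhaustively under the IC model: enumerate every outcome X, weight the h-hop reachable set by p(X) as in (29), and then test inequality (32) for every pair ${S}\subseteq {T}$. This is a brute-force sketch for illustration only, exponential in the number of edges. The graph and function names are hypothetical.

```python
from itertools import combinations, product

def reach_within(seeds, live_edges, h):
    """Nodes reachable from `seeds` in at most h hops along live edges."""
    reached = set(seeds)
    frontier = set(seeds)
    for _ in range(h):
        frontier = {v for (u, v) in live_edges if u in frontier} - reached
        reached |= frontier
    return reached

def sigma_h(seeds, edges, h):
    """Eq. (29) under IC: sum over all outcomes X of p(X) * sigma_h^X(S)."""
    edge_list = list(edges.items())
    total = 0.0
    for flips in product((True, False), repeat=len(edge_list)):
        prob = 1.0
        live = []
        for (edge, p_uv), is_live in zip(edge_list, flips):
            prob *= p_uv if is_live else 1.0 - p_uv
            if is_live:
                live.append(edge)
        total += prob * len(reach_within(seeds, live, h))
    return total

def all_subsets(items):
    """Every subset of `items` as a frozenset."""
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Hypothetical 4-node graph with a cycle; values are propagation probabilities.
edges = {("a", "b"): 0.5, ("a", "c"): 0.3, ("b", "d"): 0.6, ("c", "d"): 0.4, ("d", "a"): 0.2}
nodes = ["a", "b", "c", "d"]
h = 2
spread = {s: sigma_h(s, edges, h) for s in all_subsets(nodes)}

# Collect every (S, T, u) with S subset of T, u not in T, that violates Eq. (32).
violations = [
    (s, t, u)
    for t in spread for s in spread if s <= t
    for u in nodes if u not in t
    if spread[s | {u}] - spread[s] < spread[t | {u}] - spread[t] - 1e-12
]
print("submodularity violations:", len(violations))
```

The list of violations is empty for this graph, in line with the theorem. The check is a sanity test on one instance, not a substitute for the proof.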

Proof of Theorem 4

Let ${S}_h^*$ denote the optimal seed set for maximizing the influence spread within h hops of propagation, i.e., $\sigma _h({S}_h^*)=\max _{|{S}|=k}\sigma _h({S})$. We have

$$\begin{aligned} \sigma ({S}_h)&\ge \sigma _h({S}_h)\nonumber \\&\ge \left(\frac{1}{\kappa _{\sigma _h}}(1-e^{-\kappa _{\sigma _h}})\right)\sigma _h({S}_h^*)\nonumber \\&\ge \left(\frac{1}{\kappa _{\sigma _h}}(1-e^{-\kappa _{\sigma _h}})\right)\sigma _h({S}^*)\nonumber \\&=\left(\frac{1}{\kappa _{\sigma _h}}(1-e^{-\kappa _{\sigma _h}})\alpha \right)\sigma ({S}^*) \end{aligned}$$

(33)

The first inequality holds because the exact influence spread $\sigma (\cdot )$ places no hop limit on propagation, so it is at least the spread achieved within h hops. The second inequality holds because the greedy algorithm achieves a $\left(\frac{1}{\kappa _f}(1-e^{-\kappa _f})\right)$-approximation for maximizing a monotone submodular function f under a cardinality constraint (Conforti and Cornuéjols 1984), where the submodularity and monotonicity of $\sigma _h(\cdot )$ are given by Theorem 3. The third inequality holds because ${S}_h^*$ is the optimal solution for maximizing $\sigma _h(\cdot )$. The final equality rewrites $\sigma _h({S}^*)$ as $\alpha \cdot \sigma ({S}^*)$. $\square$
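To get a feel for the curvature-dependent factor, the sketch below evaluates $\frac{1}{\kappa }(1-e^{-\kappa })$ and runs a plain greedy selection on a toy monotone submodular function. It is an illustration under stated assumptions, not the paper's algorithm: the total curvature is computed with the usual definition $\kappa _f=1-\min _{v}\frac{f({V})-f({V}\setminus \{v\})}{f(\{v\})}$, which is assumed to match $\kappa _{\sigma _h}$ here, and the coverage function stands in for $\sigma _h$. All names are hypothetical.

```python
import math

def curvature_factor(kappa):
    """(1/kappa) * (1 - exp(-kappa)); tends to 1 as kappa -> 0."""
    if kappa == 0:
        return 1.0
    return (1.0 - math.exp(-kappa)) / kappa

def greedy(f, ground, k):
    """Pick k elements, each time adding the one with the largest marginal gain."""
    chosen = frozenset()
    for _ in range(k):
        best = max(
            (v for v in ground if v not in chosen),
            key=lambda v: f(chosen | {v}) - f(chosen),
        )
        chosen = chosen | {best}
    return chosen

def total_curvature(f, ground):
    """Assumed definition: 1 - min over v of [f(V) - f(V minus {v})] / f({v})."""
    full = frozenset(ground)
    ratios = [
        (f(full) - f(full - {v})) / f(frozenset({v}))
        for v in ground
        if f(frozenset({v})) > 0
    ]
    return 1.0 - min(ratios)

# Toy coverage function standing in for sigma_h: f(S) = number of items covered.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}

def f(subset):
    return len(set().union(*(covers[v] for v in subset))) if subset else 0

ground = list(covers)
kappa = total_curvature(f, ground)
factor = curvature_factor(kappa)
chosen = greedy(f, ground, 2)
print(kappa, factor, f(chosen))
```

For this toy function the curvature is 0.5, so the factor is about 0.787, noticeably better than the curvature-free $1-1/e\approx 0.632$ obtained at $\kappa =1$. This is what makes the guarantee data-dependent.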

We first introduce some lemmas used to prove Theorem 5.

Lemma 1

For scale-free random graphs with propagation probability $p_{u,v}=p$ for every edge $\langle u,v\rangle\in {E}$, the expected influence spread produced within one hop of propagation from a random seed set S satisfies

$$\begin{aligned} \mathbb {E}[\sigma _1({S})]\ge (p+1)k-pk^2/|{V}|. \end{aligned}$$

(34)

Proof of Lemma 1

With one hop of propagation, for a randomly selected node v, it is not activated if and only if v is not a seed and v is not activated by any of its inverse neighbors. The probability for v to be a non-seed node is $1-\frac{k}{|{V}|}$. The probability for an inverse neighbor of v to be a seed is $\frac{k}{|V|}$, and thus, the probability for it to activate v is $p\cdot \frac{k}{|{V}|}$. Therefore, the probability for all of v’s inverse neighbors to fail to activate v is

$$\begin{aligned} \prod _{u\in {I}_v}\left(1-p\cdot \frac{k}{|{V}|}\right)=\left(1-\frac{pk}{|{V}|}\right)^{|{I}_v|}. \end{aligned}$$

(35)

Note that if v is selected as a seed, it must be activated. Hence, the overall activation probability of v is

$$\begin{aligned} \pi _1^{S}(v)=1-\left(1-\frac{k}{|{V}|}\right)\cdot \left(1-\frac{pk}{|{V}|}\right)^{|{I}_v|}. \end{aligned}$$

(36)

As a result, the expectation of the activation probability of a random node v is given by

$$\begin{aligned} \mathbb {E}[\pi _1^{S}(v)]&=\mathbb {E}\left [1-\left(1-\frac{k}{|{V}|}\right )\cdot \left(1-\frac{pk}{|{V}|}\right )^{|{I}_v|}\right ]\nonumber \\&=1-\left (1-\frac{k}{|{V}|}\right )\cdot \sum _{|{I}_v|=1}^{\infty }\left (P_0(|{I}_v|)\cdot \left (1-\frac{pk}{|{V}|}\right )^{|{I}_v|}\right )\nonumber \\&\ge 1-\left (1-\frac{k}{|{V}|}\right )\cdot \left (1-\frac{pk}{|{V}|}\right )\cdot \sum _{|{I}_v|=1}^{\infty }P_0(|{I}_v|)\nonumber \\&=1-\left (1-\frac{k}{|{V}|}\right )\cdot \left (1-\frac{pk}{|{V}|}\right )\nonumber \\&=\frac{(1+p)k}{|{V}|}-\frac{pk^2}{|{V}|^2}. \end{aligned}$$

(37)

Therefore, it holds that $\mathbb {E}[\sigma _1({S})]=|{V}|\cdot \mathbb {E}[\pi _1^{S}(v)]\ge (p+1)k-pk^2/|{V}|$. This completes the proof. $\square$
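The chain in (37) can be checked numerically for a concrete degree distribution. The sketch below assumes $P_0(d)\propto d^{-\gamma }$, which is not restated in this appendix, and truncates it at a finite maximum degree so that the sum is computable. Parameter values and function names are hypothetical.

```python
def power_law_pmf(gamma, d_max):
    """Truncated power law on degrees 1..d_max; entry i holds P_0(i + 1) (assumed form)."""
    weights = [d ** (-gamma) for d in range(1, d_max + 1)]
    z = sum(weights)
    return [w / z for w in weights]

def expected_pi1(k, n, p, pmf):
    """Second line of Eq. (37): expected one-hop activation probability of a random node."""
    miss = sum(prob * (1.0 - p * k / n) ** (i + 1) for i, prob in enumerate(pmf))
    return 1.0 - (1.0 - k / n) * miss

def lemma1_bound(k, n, p):
    """Last line of Eq. (37): (1 + p) k / |V| - p k^2 / |V|^2."""
    return (1.0 + p) * k / n - p * k ** 2 / n ** 2

pmf = power_law_pmf(gamma=3.0, d_max=1000)
n, p = 10_000, 0.1
for k in (1, 50, 500):
    exact = expected_pi1(k, n, p, pmf)
    bound = lemma1_bound(k, n, p)
    print(k, n * exact, n * bound)  # expected one-hop spread vs. the bound in (34)
```

The bound is tight when most nodes have in-degree 1 and loosens as the degree distribution puts more mass on larger degrees, since the inequality step replaces every exponent $|{I}_v|$ by 1.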

Lemma 2

(Li et al. 2012) For an infinite random power law graph, the expected fraction of nodes activated $\phi ({S})=\mathbb {E}[\sigma ({S})]/|{V}|$ can be computed by

$$\begin{aligned} {\left\{ \begin{array}{ll} 1-\varphi ({S})=\left(1-\frac{k}{|{V}|}\right)\sum _{d=0}^{\infty }P_1(d+1)\left (1-p\varphi ({S})\right )^d,\\ 1-\phi ({S})=\left (1-\frac{k}{|{V}|}\right )\sum _{d=1}^{\infty }P_0(d)\left (1-p\varphi ({S})\right )^d, \end{array}\right. } \end{aligned}$$

(38)

where $P_1(d)=\frac{d^{1-\gamma }}{\sum _{j=1}^{\infty }j^{1-\gamma }}$ is the probability of a node connecting to a neighbor whose degree is d, and $\varphi ({S})$ is an auxiliary variable.

Lemma 3

The expected influence spread $\mathbb {E}[\sigma ({S})]=|{V}|\cdot \phi ({S})$ is bounded by

$$\begin{aligned} \mathbb {E}[\sigma ({S})]\le |{V}|\cdot \left(1-\left(1-\frac{k}{|{V}|}\right)P_0(1)(1-pA)\right), \end{aligned}$$

(39)

where $A=1-\left (1-\frac{k}{|{V}|}\right )P_1(1)$.

Proof of Lemma 3

From (38) in Lemma 2, we have

$$\begin{aligned} 1-\varphi ({S})\ge \left (1-\frac{k}{|{V}|}\right )P_1(1)\left (1-p\varphi ({S})\right )^0=1-A, \end{aligned}$$

(40)

and

$$\begin{aligned} 1-\phi ({S})\ge \left (1-\frac{k}{|{V}|}\right )P_0(1)\left (1-p\varphi ({S})\right ). \end{aligned}$$

(41)

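Inequality (40) gives $\varphi ({S})\le A$, and hence $1-p\varphi ({S})\ge 1-pA$. Substituting this into (41) yields $\phi ({S})\le 1-\left(1-\frac{k}{|{V}|}\right)P_0(1)(1-pA)$, and multiplying by $|{V}|$ gives (39). $\square$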
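The system (38) has no closed form, but its first equation can be solved by fixed-point iteration, after which the second equation gives $\phi ({S})$ and the bound (39) can be compared against it. The sketch below is an illustration under assumptions: $P_0(d)\propto d^{-\gamma }$ is assumed, both distributions are truncated at a finite maximum degree in place of the infinite sums, and all parameter values and names are hypothetical.

```python
def power_law(exponent, d_max):
    """Truncated distribution proportional to d^(-exponent) on degrees 1..d_max."""
    weights = {d: d ** (-exponent) for d in range(1, d_max + 1)}
    z = sum(weights.values())
    return {d: w / z for d, w in weights.items()}

def step(varphi, c, p, P1, d_max):
    """Right-hand side of the first equation in (38), solved for varphi."""
    return 1.0 - c * sum(P1[d + 1] * (1.0 - p * varphi) ** d for d in range(d_max))

def solve(k, n, p, gamma, d_max=1000, iters=200):
    c = 1.0 - k / n
    P0 = power_law(gamma, d_max)        # assumed degree distribution P_0
    P1 = power_law(gamma - 1.0, d_max)  # P_1(d) proportional to d^(1 - gamma)
    varphi = 0.0
    for _ in range(iters):              # fixed-point iteration starting from 0
        varphi = step(varphi, c, p, P1, d_max)
    phi = 1.0 - c * sum(P0[d] * (1.0 - p * varphi) ** d for d in range(1, d_max + 1))
    A = 1.0 - c * P1[1]
    upper = 1.0 - c * P0[1] * (1.0 - p * A)  # Lemma 3, as a fraction of |V|
    residual = abs(varphi - step(varphi, c, p, P1, d_max))
    return varphi, phi, A, upper, residual

varphi, phi, A, upper, residual = solve(k=100, n=10_000, p=0.1, gamma=3.0)
print(phi, upper)
```

In this setting the computed fraction `phi` stays below `upper`, and the auxiliary variable stays below A, matching the two steps of the proof. The bound is loose because it keeps only the degree-1 term of each sum.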

Proof of Theorem 5

Lemma 1 indicates that

$$\begin{aligned} \mathbb {E}[\sigma _h({S})]\ge \mathbb {E}[\sigma _1({S})]\ge (p+1)k-pk^2/|{V}|. \end{aligned}$$

(42)

Lemma 3 indicates that

$$\begin{aligned} \mathbb {E}[\sigma ({S})]\le |{V}|\cdot \left (1-\left (1-\frac{k}{|{V}|}\right )P_0(1)(1-pA)\right ). \end{aligned}$$

(43)

Putting (42) and (43) together, the theorem follows. $\square$
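The statement of Theorem 5 is not reproduced in this appendix. Assuming it bounds the ratio of the hop-limited spread to the unrestricted spread, combining the two inequalities amounts to dividing the lower bound (42) by the upper bound (43), both of which are positive:

$$\begin{aligned} \frac{\mathbb {E}[\sigma _h({S})]}{\mathbb {E}[\sigma ({S})]}\ge \frac{(p+1)k-pk^2/|{V}|}{|{V}|\cdot \left(1-\left(1-\frac{k}{|{V}|}\right)P_0(1)(1-pA)\right)},\quad \text{where } A=1-\left(1-\frac{k}{|{V}|}\right)P_1(1). \end{aligned}$$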

About this article

Cite this article

Tang, J., Tang, X. & Yuan, J. An efficient and effective hop-based approach for influence maximization in social networks. Soc. Netw. Anal. Min. 8, 10 (2018). https://doi.org/10.1007/s13278-018-0489-y

Received: 30 September 2017
Revised: 10 January 2018
Accepted: 31 January 2018
Published: 10 February 2018
DOI: https://doi.org/10.1007/s13278-018-0489-y
