Maximizing misinformation restriction within time and budget constraints

Journal of Combinatorial Optimization

Abstract

Online social networks have become popular media worldwide. However, they also allow rapid dissemination of misinformation, with negative impacts on users. Given a source of misinformation, the longer the misinformation spreads, the greater the number of affected users. Therefore, it is necessary to prevent the spread of misinformation within a specific time period. In this paper, we propose the maximizing misinformation restriction (\(\mathsf {MMR}\)) problem, whose goal is to find a set of nodes whose removal from a social network maximizes the influence reduction from the source of misinformation within time and budget constraints. We demonstrate that the \(\mathsf {MMR}\) problem is NP-hard even when the network is a tree rooted at a single misinformation node, and show that computing the objective function is #P-hard. We also prove that the objective function is monotone and submodular. Based on that, we propose a \((1{-}1/\sqrt{e})\)-approximation algorithm. We further design efficient heuristic algorithms, named \(\mathsf {PR}\)-\(\mathsf {DAG}\), to solve \(\mathsf {MMR}\) on very large-scale networks.

References

  • Bhagat S, Goyal A, Lakshmanan LV (2012) Maximizing product adoption in social networks. In: Proceedings of the fifth ACM international conference on Web search and data mining, Seattle, Washington, pp 603–612

  • Budak C, Agrawal D, Abbadi AE (2011) Limiting the spread of misinformation in social networks. In: Proceedings of the 20th international conference on world wide web, WWW ’11, ACM, New York, NY. https://doi.org/10.1145/1963405.1963499

  • Cha M, Mislove A, Gummadi KP (2009) A measurement-driven analysis of information propagation in the Flickr social network. In: Proceedings of the 18th international conference on world wide web, New York, USA, pp 721–730

  • Chen W, Lakshmanan LVS, Castillo C (2013) Information and influence propagation in social networks. Morgan and Claypool, San Rafael

  • Chen W, Lu W, Zang N (2012) Time-critical influence maximization in social networks with time-delayed diffusion process. In: Proceedings of the twenty-sixth AAAI conference on artificial intelligence, Toronto, Ontario, pp 592–598

  • Chen W, Wang C, Wang Y (2010) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, pp 1029–1038

  • Chen W, Wang C, Wang Y (2010) Scalable influence maximization in social networks under the linear threshold model. In: Proceedings of the 2010 IEEE international conference on data mining, Washington, pp 88–97

  • Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, New York, USA, pp 57–66

  • Domm P (2013) False rumor of explosion at White House causes stocks to briefly plunge; AP confirms its Twitter feed was hacked. CNBC. http://www.cnbc.com/id/100646197. Accessed 23 April 2013

  • Gentzkow M (2017) Social media and fake news in the 2016 election. Stanford Web. http://news.stanford.edu/2017/01/18/stanford-study-examines-fake-news-2016-presidential-election/. Accessed 24 June 2017

  • Goyal A, Lu W, Lakshmanan LVS (2012) SIMPATH: an efficient algorithm for influence maximization under the linear threshold model. In: Proceeding IEEE 11th international conference on data mining, Vancouver, BC, pp 211–220

  • He X, Song G, Chen W, Jiang Q (2011) Influence blocking maximization in social networks under the competitive linear threshold model. Technical report, CoRR abs/1110.4723

  • Hughes AL, Palen L (2009) Twitter adoption and use in mass convergence and emergency events. Int J Emerg Manage 6(3):248–260

  • Kempe D, Kleinberg J, Tardos E (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining. https://doi.org/10.1145/956750.956769

  • Khalil EB, Dilkina B, Song L (2014) Scalable diffusion-aware optimization of network topology. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, New York, pp 1226–1235

  • Khuller S, Moss A, Naor JS (1999) The budgeted maximum coverage problem. Inf Process Lett 70(1):39–45

  • Kimura M, Saito K, Motoda H (2008) Solving the contamination minimization problem on networks for the linear threshold model. In: Pacific rim international conference on artificial intelligence. https://doi.org/10.1007/978-3-540-89197-0_94

  • Kimura M, Saito K, Motoda H (2009) Blocking links to minimize contamination spread in a social network. In: ACM transactions on knowledge discovery from data. https://doi.org/10.1145/1514888.1514892

  • Kottasov I (2017) Facebook targets 30,000 fake accounts in France. CNN media Web. http://money.cnn.com/2017/04/14/media/facebook-fake-news-france-election/index.html. Accessed 24 June 2017

  • Kwon S, Cha M, Jung K, Chen W, Wang Y (2013) Prominent features of rumor propagation in online social media. In: Proceeding of IEEE 13th international conference on data mining. https://doi.org/10.1109/ICDM.2013.61

  • Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web. https://doi.org/10.1145/1232722

  • Leskovec J, Kleinberg J, Faloutsos C (2005) Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceeding of ACM SIGKDD international conference on knowledge discovery and data mining. https://doi.org/10.1145/1081870.1081893

  • Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2

  • Liu B, Cong G, Xu D, Zheng Y (2012) Time constrained influence maximization in social networks. In: Proceeding of IEEE 12th international conference on data mining, Belgium, Brussels, pp 439–448

  • Nguyen H, Zheng R (2013) On budgeted influence maximization in social networks. IEEE J Sel Areas Commun 31(6):1084–1094

  • Nguyen NP, Yan G, Thai MT (2013) Analysis of misinformation containment in online social networks. Comput Netw 57:2133–2146

  • Qazvinian V, Rosengren E, Radev DR, Mei Q (2011) Rumor has it: identifying misinformation in microblogs. In: Proceedings of the conference on empirical methods in natural language processing, Edinburgh, pp 1589–1599

  • Richardson M, Agrawal R, Domingos P (2003) Trust management for the semantic web. In: Proceeding of international semantic web conference. https://doi.org/10.1007/978-3-540-39718-2_23

  • Sutter JD (2017) How bin Laden news spread on Twitter. CNN Web. http://edition.cnn.com/2011/TECH/social.media/05/02/osama.bin.laden.twitter/index.html. Accessed 23 June 2017

  • Valiant LG (1979) The complexity of enumeration and reliability problems. SIAM J Comput 8(3):410–421

  • Wolfsfeld G, Segev E, Sheafer T (2013) Social media and the Arab Spring: politics comes first. Int J Press Polit 18(2):115–137

  • Yadron D (2017) Twitter deletes 125,000 Isis accounts and expands anti-terror teams. The Guardian Web. https://www.theguardian.com/technology/2016/feb/05/twitter-deletes-isis-accounts-terrorism-online. Accessed 24 June 2017

  • Zhang H, Alim M, Li X, My TT, Nguyen H (2016a) Misinformation in online social networks: catch them all with limited budget. ACM Trans Inf Syst 34(3):18

  • Zhang Y, Adigay A, Saha S, Vullikanti A, Prakash A (2016b) Near-optimal algorithms for controlling propagation at group scale on networks. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2016.2605088

  • Zhang H, Dinh TN, Thai MT (2013) Maximizing the spread of positive influence in online social networks. In: Proceeding IEEE 33rd international conference on distributed computing systems, Philadelphia, PA, pp 317–326

  • Zhang H, Kuhnle A, Zhang H, Thai MT (2016c) Detecting misinformation in online social networks before it is too late. In: Proceeding IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM). https://doi.org/10.1109/ASONAM.2016.7752288

  • Zhang Y, Prakash B (2015) Data-aware vaccine allocation over large networks. ACM Trans Knowl Discov Data. https://doi.org/10.1145/2803176

  • Zhang Y, Prakash BA (2014) Scalable vaccine distribution in large graphs given uncertain data. In: Proceeding of the ACM international conference on information and knowledge management, Shanghai, pp 1719–1728

Author information

Corresponding author

Correspondence to My T. Thai.

Appendix

Let \(e=(u, v)\) and \(f=(u', v')\). Let \(\mathcal {G}^e(G\setminus X)\) denote the set of sample graphs in which the incoming edge \(e=(u, v)\) is selected for node v, \(\mathcal {G}^{\overline{e}}(G\setminus X)\) the set of sample graphs in which a different incoming edge \(\overline{e}=(y, v)\) is selected for node v, and \(\mathcal {G}^{\emptyset }(G\setminus X)\) the set of sample graphs in which no incoming edge is selected for node v. According to Khalil et al. (2014), we have the following results:

Proposition 2

(Khalil et al. (2014), proposition 1) For every live-edge graph \(g \in \mathcal {G}^{\emptyset }(G \setminus X)\), there exists a corresponding live-edge graph \(\widetilde{g} \in \mathcal {G}^e(G \setminus X)\) and vice versa. If \(g=(V, E_g)\) then \(\widetilde{g}=(V, E_g \cup \{e\} )\).

Proposition 3

(Khalil et al. (2014), proposition 2) \(\mathcal {G}(G \setminus (X\cup \{e\})) \subseteq \mathcal {G}(G \setminus X)\) and furthermore \(\mathcal {G}(G \setminus (X\cup \{e\}))=\mathcal {G}^{\overline{e}}(G \setminus X) \cup \mathcal {G}^{\emptyset }(G \setminus X)\).

Proposition 4

(Khalil et al. (2014), proposition 3) Given \(f=(u', v') \in E \setminus X, v' \ne v\), let \(t=|\mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))|\). Then \(\mathcal {G}^{\emptyset }(G \setminus X)\) can be partitioned into t sets \(\{ \varPhi _i \}_{i=1}^t\) such that, for every \(\varPhi _i\), there exists a corresponding \(g_i \in \mathcal {G}^{\emptyset }( G \setminus (X \cup \{f\}))\) and vice versa.

Proposition 5

(Khalil et al. (2014), proposition 4) For every \(\varPhi _i \subseteq \mathcal {G}^{\emptyset }(G \setminus X)\) and its associated \(g_i \in \mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))\), \(\Pr [g_i|G \setminus (X \cup \{f \})]= \sum _{H \in \varPhi _i} \Pr [H|G \setminus X]\).

Proof of Lemma 3

We need to show that \( \sigma _{d,E}(S, X) \ge \sigma _{d,E}(S, X \cup \{e\})\). The idea of the proof is similar to that of Theorem 5 in Khalil et al. (2014). Using Propositions 2 and 3, we have:

$$\begin{aligned} \sigma _{d,E}(S, X) - \sigma _{d,E}(S,X \cup \{e\})&= \sum _{g \in \mathcal {G}(G\setminus X)}\Pr [g|G \setminus X]f_d(g,S) - \sum _{g \in \mathcal {G}(G \setminus (X \cup \{e\}))}\Pr [g|G \setminus (X \cup \{e\})]f_d(g,S) \\&= \sum _{g \in \mathcal {G}^e(G \setminus X)}{\Pr [g|G \setminus X]\cdot f_d(g,S)} \\&\quad + \sum _{g \in \mathcal {G}^{\emptyset }(G \setminus X)}\Big ( \Pr [g|G \setminus X]- \Pr [g|G \setminus (X \cup \{e\})]\Big )\cdot f_d(g,S) \\&\quad + \sum _{g \in \mathcal {G}^{\overline{e}}(G \setminus X)}\Big ( \Pr [g|G \setminus X]- \Pr [g|G \setminus (X\cup \{e\})]\Big )\cdot f_d(g,S) \end{aligned}$$

Recall that \(e=(u, v)\). For \(g \in \mathcal {G}^{\emptyset }(G \setminus X)\), we have:

$$\begin{aligned} \Pr [g|G \setminus X]-\Pr [g|G \setminus (X \cup \{e\})]= -w(u, v)\prod _{v' \ne v} P(v', g, G \setminus X) \end{aligned}$$
(35)
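
The step behind (35) can be spelled out as follows. This is a sketch under two assumptions that this appendix does not restate: that \(P(v, g, G \setminus X)\) is the probability of the incoming-edge choice node v makes in g under the live-edge view of the linear threshold model, and that \(e \notin X\). The shorthand \(W_v(X)=\sum _{(y, v) \in E \setminus X} w(y, v)\) is introduced here only for illustration. For \(g \in \mathcal {G}^{\emptyset }(G \setminus X)\), node v selects no incoming edge, so removing e changes only the factor of v:

$$\begin{aligned} P(v, g, G \setminus X)&= 1 - W_v(X), \\ P(v, g, G \setminus (X \cup \{e\}))&= 1 - W_v(X) + w(u, v), \\ \Pr [g|G \setminus X]-\Pr [g|G \setminus (X \cup \{e\})]&= \Big ( P(v, g, G \setminus X) - P(v, g, G \setminus (X \cup \{e\})) \Big ) \prod _{v' \ne v} P(v', g, G \setminus X) \\&= -w(u, v)\prod _{v' \ne v} P(v', g, G \setminus X). \end{aligned}$$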

For \(g \in \mathcal {G}^{\overline{e}}(G \setminus X)\) we have \(P(v, g, G \setminus X)=P(v, g, G \setminus (X\cup \{e\}))=w(\overline{e})\), which gives \(\Pr [g|G \setminus X]=\Pr [g|G \setminus (X\cup \{e\})]\). It follows that:

$$\begin{aligned} \sigma _{d,E}(S, X) - \sigma _{d,E}(S, X \cup \{e\})&= \sum _{g \in \mathcal {G}^e(G \setminus X)}{\Pr [g|G \setminus X]\cdot f_d(g,S)} \\&\quad + \sum _{g \in \mathcal {G}^{\emptyset }(G \setminus X)}{-w(u, v)\prod _{v' \ne v}P(v', g, G \setminus X)\cdot f_d(g,S)} \end{aligned}$$

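By Proposition 2, for every \(g\in \mathcal {G}^{\emptyset }(G \setminus X)\) there exists a corresponding \(\widetilde{g} \in \mathcal {G}^e(G \setminus X)\) with \(\Pr [\widetilde{g}|G \setminus X]=w(u, v)\prod _{v' \ne v}P(v', \widetilde{g}, G \setminus X)\). Since g and \(\widetilde{g}\) differ only in the incoming edge of v, \(P(v', \widetilde{g}, G \setminus X)=P(v', g, G \setminus X)\) for every \(v' \ne v\). Therefore,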

$$\begin{aligned} \sigma _{d,E}(S, X) - \sigma _{d,E}(S, X \cup \{e\})= \sum _{g \in \mathcal {G}^{\emptyset }(G \setminus X)} \Pr [\widetilde{g}|G \setminus X] \Big ( f_d(\widetilde{g}, S) - f_d(g, S) \Big ) \end{aligned}$$
(36)

Since g is a subgraph of \(\widetilde{g}\), the set of vertices reachable from S in g is a subset of the set of vertices reachable from S in \(\widetilde{g}\). Hence, \(f_d(\widetilde{g}, S) - f_d(g, S) \ge 0\), which completes the proof. \(\square \)
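
The monotonicity stated in Lemma 3 can be checked numerically on a toy instance. The sketch below is illustrative only and is not the authors' implementation. It assumes the live-edge view of the linear threshold model (each node selects at most one incoming edge, edge (u, v) with probability w(u, v)), assumes that \(f_d(g, S)\) counts the nodes within d hops of S in g (seeds included), and uses a made-up four-node graph. The names `live_edge_graphs`, `reach_within`, `sigma`, `NODES`, `WEIGHTS`, `SEEDS` and `D` are hypothetical.

```python
from itertools import product

# Hypothetical toy instance: edge weights w(u, v); incoming weights per node sum to at most 1.
NODES = ["s", "a", "b", "c"]
WEIGHTS = {("s", "a"): 0.5, ("s", "b"): 0.4, ("a", "b"): 0.3,
           ("a", "c"): 0.5, ("b", "c"): 0.3}
SEEDS = {"s"}
D = 2  # assumed hop limit d


def live_edge_graphs(nodes, weights, removed=frozenset()):
    """Yield (probability, edge set) for every live-edge graph of G minus `removed`."""
    choices = []
    for v in nodes:
        incoming = [(e, w) for e, w in weights.items()
                    if e[1] == v and e not in removed]
        none_prob = 1.0 - sum(w for _, w in incoming)
        # Each node picks one surviving incoming edge, or no edge at all.
        choices.append([(None, none_prob)] + incoming)
    for combo in product(*choices):
        prob = 1.0
        edges = set()
        for e, p in combo:
            prob *= p
            if e is not None:
                edges.add(e)
        yield prob, edges


def reach_within(edges, seeds, d):
    """Count nodes within d hops of `seeds` along `edges` (seeds included)."""
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(d):
        frontier = {v for (u, v) in edges if u in frontier} - seen
        seen |= frontier
    return len(seen)


def sigma(nodes, weights, seeds, d, removed=frozenset()):
    """Expected d-hop spread, by exhaustive enumeration of live-edge graphs."""
    return sum(p * reach_within(edges, seeds, d)
               for p, edges in live_edge_graphs(nodes, weights, removed))


if __name__ == "__main__":
    base = sigma(NODES, WEIGHTS, SEEDS, D)
    for e in WEIGHTS:
        after = sigma(NODES, WEIGHTS, SEEDS, D, frozenset({e}))
        print(e, base, after, base >= after - 1e-12)
```

Exhaustive enumeration is exponential in the number of nodes, so this is only suitable for sanity checks on very small graphs.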

Proof of Lemma 4

The idea of the proof is similar to that of Theorem 6 in Khalil et al. (2014). For an edge \(f \in E\), let \(t=|\mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))|\). By Proposition 4, we can partition \(\mathcal {G}^{\emptyset }(G \setminus X)\) into t sets \(\{\varPhi _i\}_{i=1}^t\) and rewrite (36) as:

$$\begin{aligned} \sigma _{d,E}(S, X) - \sigma _{d,E}(S, X \cup \{e\})= \sum _{g \in \mathcal {G}^{\emptyset }(G \setminus X)} \Pr [\widetilde{g}|G \setminus X] \Big ( f_d(\widetilde{g}, S) - f_d(g, S) \Big ) \nonumber \\ =\sum _{i=1}^t\sum _{g \in \varPhi _i} \Pr [\widetilde{g}|G \setminus X] \Big ( f_d(\widetilde{g}, S) - f_d(g, S) \Big ) \end{aligned}$$
(37)

Applying the reasoning that led to Eq. (36) in the proof of Lemma 3 to \(G \setminus (X \cup \{f \})\), we have:

$$\begin{aligned}&\sigma _{d,E}(S, X \cup \{f\}) - \sigma _{d,E}(S, X \cup \{f, e\}) \nonumber \\&=\sum _{g \in \mathcal {G}^{\emptyset }(G \setminus (X \cup \{f \}))} \Pr [\widetilde{g}|G \setminus (X \cup \{f\})] \Big ( f_d(\widetilde{g}, S) - f_d(g, S) \Big ) \end{aligned}$$
(38)

We compare Eqs. (37) and (38) term by term for each \(g_i \in \mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\})), i=1,\ldots ,t \). There are two cases: (1) \(\varPhi _i =\{g_i\}\), when \(v'\) has an incoming edge other than f in \(g_i\); the corresponding terms in the two equations are then equal; (2) \(\varPhi _i=\{g_i, g_i'\}\), when \(v'\) has no incoming edge in \(g_i\), so that f can be added as the incoming edge of \(v'\) to obtain \(g_i'\); in this case we need to prove:

$$\begin{aligned} \Pr [\widetilde{g_i}|G \setminus X] \Big ( f_d(\widetilde{g_i}, S) - f_d(g_i, S) \Big )+\Pr [\widetilde{g'_i}|G \setminus X] \Big ( f_d(\widetilde{g'_i}, S) - f_d(g'_i, S) \Big ) \nonumber \\ \ge \Pr [\widetilde{g_i}|G \setminus (X \cup \{f\})] \Big ( f_d(\widetilde{g_i}, S) - f_d(g_i, S) \Big ) \end{aligned}$$
(39)

Using Proposition 5 (Khalil et al. 2014), we have:

$$\begin{aligned} \Pr [\widetilde{g_i}|G \setminus (X \cup \{f\})] = \Pr [\widetilde{g_i}|G \setminus X] + \Pr [\widetilde{g'_i}|G \setminus X] \end{aligned}$$
(40)

Hence, inequality (39) is true if:

$$\begin{aligned} f_d(\widetilde{g'_i}, S) - f_d(g'_i, S) \ge f_d(\widetilde{g_i}, S) - f_d(g_i, S) \end{aligned}$$
(41)
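
Why (41) suffices can be checked by substituting (40) into the right-hand side of (39). Writing \(\Delta _i = f_d(\widetilde{g_i}, S) - f_d(g_i, S)\) and \(\Delta '_i = f_d(\widetilde{g'_i}, S) - f_d(g'_i, S)\), a shorthand introduced here only for illustration, inequality (39) becomes:

$$\begin{aligned} \Pr [\widetilde{g_i}|G \setminus X]\, \Delta _i + \Pr [\widetilde{g'_i}|G \setminus X]\, \Delta '_i&\ge \Big ( \Pr [\widetilde{g_i}|G \setminus X] + \Pr [\widetilde{g'_i}|G \setminus X] \Big )\, \Delta _i \\ \Longleftrightarrow \quad \Pr [\widetilde{g'_i}|G \setminus X]\, \big ( \Delta '_i - \Delta _i \big )&\ge 0, \end{aligned}$$

which holds whenever \(\Delta '_i \ge \Delta _i\), since probabilities are nonnegative.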

Note that \(\widetilde{g'_i}=(V, E_{\widetilde{g_i}} \cup \{f\})\) and that live-edge graphs are constructed so that each node has at most one incoming edge. Every reachability path in \(\widetilde{g_i}\) is also present in \(\widetilde{g'_i}\). Hence, if removing edge e from \(\widetilde{g_i}\) makes some nodes unreachable in \(g_i\), the same nodes become unreachable when e is removed from \(\widetilde{g'_i}\). Removing edge e from \(\widetilde{g'_i}\) may additionally disconnect nodes whose paths from the source include edge f. Therefore, the reduction in reachable nodes when removing edge e from \(\widetilde{g'_i}\) is at least the reduction when removing e from \(\widetilde{g_i}\), which implies (41). \(\square \)
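
The diminishing-returns inequality proved above, \(\sigma _{d,E}(S, X) - \sigma _{d,E}(S, X \cup \{e\}) \ge \sigma _{d,E}(S, X \cup \{f\}) - \sigma _{d,E}(S, X \cup \{f, e\})\), can also be checked by brute force on a small instance. The following is a minimal sketch, not the authors' code. It assumes the live-edge view of the linear threshold model, assumes that \(f_d(g, S)\) counts the nodes within d hops of S (seeds included), and uses a made-up graph. The names `sigma`, `reduction`, `NODES`, `WEIGHTS`, `SEEDS` and `D` are hypothetical.

```python
from itertools import product

# Hypothetical toy instance; incoming weights per node sum to at most 1.
NODES = ["s", "a", "b", "c"]
WEIGHTS = {("s", "a"): 0.5, ("s", "b"): 0.4, ("a", "b"): 0.3,
           ("a", "c"): 0.5, ("b", "c"): 0.3}
SEEDS = {"s"}
D = 2  # assumed hop limit d


def sigma(nodes, weights, seeds, d, removed=frozenset()):
    """Expected d-hop spread in G minus `removed`, enumerating all live-edge graphs."""
    options = []
    for v in nodes:
        inc = [(e, w) for e, w in weights.items()
               if e[1] == v and e not in removed]
        # Option None means "node v selects no incoming edge".
        options.append([(None, 1.0 - sum(w for _, w in inc))] + inc)
    total = 0.0
    for combo in product(*options):
        prob = 1.0
        for _, p in combo:
            prob *= p
        edges = {e for e, _ in combo if e is not None}
        seen = set(seeds)
        frontier = set(seeds)
        for _ in range(d):
            frontier = {v for (u, v) in edges if u in frontier} - seen
            seen = seen | frontier
        total += prob * len(seen)
    return total


def reduction(nodes, weights, seeds, d, e, removed=frozenset()):
    """Drop in expected spread caused by removing edge e on top of `removed`."""
    before = sigma(nodes, weights, seeds, d, removed)
    after = sigma(nodes, weights, seeds, d, removed | {e})
    return before - after


if __name__ == "__main__":
    for e in WEIGHTS:
        for f in WEIGHTS:
            if e != f:
                lhs = reduction(NODES, WEIGHTS, SEEDS, D, e)
                rhs = reduction(NODES, WEIGHTS, SEEDS, D, e, frozenset({f}))
                print(e, f, lhs >= rhs - 1e-12)
```

The small tolerance guards against floating-point rounding; a check of this kind illustrates the inequality on one instance and does not replace the proof.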

Cite this article

Pham, C.V., Thai, M.T., Duong, H.V. et al. Maximizing misinformation restriction within time and budget constraints. J Comb Optim 35, 1202–1240 (2018). https://doi.org/10.1007/s10878-018-0252-3

  • DOI: https://doi.org/10.1007/s10878-018-0252-3
