Abstract
Online social networks have become popular media worldwide. However, they also allow rapid dissemination of misinformation, which negatively affects users. Given a source of misinformation, the longer the misinformation spreads, the greater the number of affected users will be. Therefore, it is necessary to prevent the spread of misinformation within a specific time period. In this paper, we propose the maximizing misinformation restriction (\(\mathsf {MMR}\)) problem, whose goal is to find a set of nodes whose removal from a social network maximizes the influence reduction from the source of misinformation within time and budget constraints. We demonstrate that the \(\mathsf {MMR}\) problem is NP-hard even when the network is a tree rooted at a single misinformation node, and show that computing the objective function is #P-hard. We also prove that the objective function is monotone and submodular. Based on that, we propose a \((1{-}1/\sqrt{e})\)-approximation algorithm. We further design efficient heuristic algorithms, named \(\mathsf {PR}\)-\(\mathsf {DAG}\), to solve \(\mathsf {MMR}\) in very large-scale networks.













References
Bhagat S, Goyal A, Lakshmanan LV (2012) Maximizing product adoption in social networks. In: Proceedings of the fifth ACM international conference on Web search and data mining, Seattle, Washington, pp 603–612
Budak C, Agrawal D, Abbadi AE (2011) Limiting the spread of misinformation in social networks. In: Proceedings of the 20th international conference on world wide web, WWW ’11, ACM, New York, NY. https://doi.org/10.1145/1963405.1963499
Cha M, Mislove A, Gummadi KP (2009) A measurement-driven analysis of information propagation in the Flickr social network. In: Proceedings of the 18th international conference on world wide web, New York, USA, pp 721–730
Chen W, Lakshmanan LVS, Castillo C (2013) Information and influence propagation in social networks. Morgan and Claypool, San Rafael
Chen W, Lu W, Zang N (2012) Time-critical influence maximization in social networks with time-delayed diffusion process. In: Proceedings of the twenty-sixth AAAI conference on artificial intelligence, Toronto, Ontario, pp 592–598
Chen W, Wang C, Wang Y (2010) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, pp 1029–1038
Chen W, Wang C, Wang Y (2010) Scalable influence maximization in social networks under the linear threshold model. In: Proceedings of the 2010 IEEE international conference on data mining, Washington, pp 88–97
Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining, New York, USA, pp 57–66
Domm P (2013) False rumor of explosion at White House causes stocks to briefly plunge; AP confirms its Twitter feed was hacked. CNBC. http://www.cnbc.com/id/100646197. Accessed 23 April 2013
Gentzkow M (2017) Social media and fake news in the 2016 election. Stanford Web. http://news.stanford.edu/2017/01/18/stanford-study-examines-fake-news-2016-presidential-election/. Accessed 24 June 2017
Goyal A, Lu W, Lakshmanan LVS (2012) SIMPATH: an efficient algorithm for influence maximization under the linear threshold model. In: Proceeding IEEE 11th international conference on data mining, Vancouver, BC, pp 211–220
He X, Song G, Chen W, Jiang Q (2011) Influence blocking maximization in social networks under the competitive linear threshold model. Technical report, CoRR abs/1110.4723
Hughes AL, Palen L (2009) Twitter adoption and use in mass convergence and emergency events. Int J Emerg Manage 6(3):248–260
Kempe D, Kleinberg J, Tardos E (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining. https://doi.org/10.1145/956750.956769
Khalil EB, Dilkina B, Song L (2014) Scalable diffusion-aware optimization of network topology. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, New York, pp 1226–1235
Khuller S, Moss A, Naor JS (1999) The budgeted maximum coverage problem. Inf Process Lett 70(1):39–45
Kimura M, Saito K, Motoda H (2008) Solving the contamination minimization problem on networks for the linear threshold model. In: Pacific rim international conference on artificial intelligence. https://doi.org/10.1007/978-3-540-89197-0_94
Kimura M, Saito K, Motoda H (2009) Blocking links to minimize contamination spread in a social network. In: ACM transactions on knowledge discovery from data. https://doi.org/10.1145/1514888.1514892
Kottasov I (2017) Facebook targets 30,000 fake accounts in France. CNN media Web. http://money.cnn.com/2017/04/14/media/facebook-fake-news-france-election/index.html. Accessed 24 June 2017
Kwon S, Cha M, Jung K, Chen W, Wang Y (2013) Prominent features of rumor propagation in online social media. In: Proceeding of IEEE 13th international conference on data mining. https://doi.org/10.1109/ICDM.2013.61
Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web. https://doi.org/10.1145/1232722
Leskovec J, Kleinberg J, Faloutsos C (2005) Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceeding of ACM SIGKDD international conference on knowledge discovery and data mining. https://doi.org/10.1145/1081870.1081893
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: densification and shrinking diameters. ACM Trans Knowl Discov Data 1(1):2
Liu B, Cong G, Xu D, Zheng Y (2012) Time constrained influence maximization in social networks. In: Proceeding of IEEE 12th international conference on data mining, Belgium, Brussels, pp 439–448
Nguyen H, Zheng R (2013) On budgeted influence maximization in social networks. IEEE J Sel Areas Commun 31(6):1084–1094
Nguyen NP, Yan G, Thai MT (2013) Analysis of misinformation containment in online social networks. Comput Netw 57:2133–2146
Qazvinian V, Rosengren E, Radev DR, Mei Q (2011) Rumor has it: identifying misinformation in microblogs. In: Proceedings of the conference on empirical methods in natural language processing, Edinburgh, pp 1589–1599
Richardson M, Agrawal R, Domingos P (2003) Trust management for the semantic web. In: Proceeding of international semantic web conference. https://doi.org/10.1007/978-3-540-39718-2_23
Sutter JD (2017) How bin Laden news spread on Twitter. CNN Web. http://edition.cnn.com/2011/TECH/social.media/05/02/osama.bin.laden.twitter/index.html. Accessed 23 June 2017
Valiant LG (1979) The complexity of enumeration and reliability problems. SIAM J Comput 8(3):410–421
Wolfsfeld G, Segev E, Sheafer T (2013) Social media and the Arab Spring: politics comes first. Int J Press Polit 18(2):115–137
Yadron D (2017) Twitter deletes 125,000 Isis accounts and expands anti-terror teams. The Guardian Web. https://www.theguardian.com/technology/2016/feb/05/twitter-deletes-isis-accounts-terrorism-online. Accessed 24 June 2017
Zhang H, Alim M, Li X, My TT, Nguyen H (2016a) Misinformation in online social networks: catch them all with limited budget. ACM Trans Inf Syst 34(3):18
Zhang Y, Adigay A, Saha S, Vullikanti A, Prakash A (2016b) Near-optimal algorithms for controlling propagation at group scale on networks. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2016.2605088
Zhang H, Dinh TN, Thai MT (2013) Maximizing the spread of positive influence in online social networks. In: Proceeding IEEE 33rd international conference on distributed computing systems, Philadelphia, PA, pp 317–326
Zhang H, Kuhnle A, Zhang H, Thai MT (2016c) Detecting misinformation in online social networks before it is too late. In: Proceeding IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM). https://doi.org/10.1109/ASONAM.2016.7752288
Zhang Y, Prakash B (2015) Data-aware vaccine allocation over large networks. ACM Trans Knowl Discov Data. https://doi.org/10.1145/2803176
Zhang Y, Prakash BA (2014) Scalable vaccine distribution in large graphs given uncertain data. In: Proceeding of the ACM international conference on information and knowledge management, Shanghai, pp 1719–1728
Appendix
Let \(e=(u, v)\) and \(f=(u', v')\). Let \(\mathcal {G}^e(G\setminus X)\) be the set of sample graphs in which the incoming edge \(e=(u, v)\) is selected for node v, \(\mathcal {G}^{\overline{e}}(G\setminus X)\) the set of sample graphs in which a different incoming edge \(\overline{e}=(y, v)\) is selected for node v, and \(\mathcal {G}^{\emptyset }(G\setminus X)\) the set of sample graphs in which no incoming edge is selected for node v. According to Khalil et al. (2014), we have the following results:
Proposition 2
(Khalil et al. (2014), proposition 1) For every live-edge graph \(g \in \mathcal {G}^{\emptyset }(G \setminus X)\), there exists a corresponding live-edge graph \(\widetilde{g} \in \mathcal {G}^e(G \setminus X)\) and vice versa. If \(g=(V, E_g)\) then \(\widetilde{g}=(V, E_g \cup \{e\} )\).
Proposition 3
(Khalil et al. (2014), proposition 2) \(\mathcal {G}(G \setminus (X\cup \{e\})) \subseteq \mathcal {G}(G \setminus X)\) and furthermore \(\mathcal {G}(G \setminus (X\cup \{e\}))=\mathcal {G}^{\overline{e}}(G \setminus X) \cup \mathcal {G}^{\emptyset }(G \setminus X)\).
Proposition 4
(Khalil et al. (2014), proposition 3) Given \(f=(u', v') \in E \setminus X, v' \ne v\), let \(t=|\mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))|\). Then \(\mathcal {G}^{\emptyset }(G \setminus X)\) can be partitioned into t sets \(\{ \varPhi _i \}_{i=1}^t\) such that, for every \(\varPhi _i\), there exists a corresponding \(g_i \in \mathcal {G}^{\emptyset }( G \setminus (X \cup \{f\}))\) and vice versa.
Proposition 5
(Khalil et al. (2014), proposition 4) For every \(\varPhi _i \subseteq \mathcal {G}^{\emptyset }(G \setminus X)\) and its associated \(g_i \in \mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))\), \(\Pr [g_i|G \setminus (X \cup \{f \})]= \sum _{H \in \varPhi _i} \Pr [H|G \setminus X]\).
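The partition of sample graphs used in the propositions above can be checked by brute force on a small instance. The following is a minimal sketch, assuming the linear threshold live-edge construction in which every node selects at most one incoming edge with probability equal to the edge weight; the toy graph, its weights, and the helper names `live_edge_graphs` and `partition_by_choice` are hypothetical and do not come from the paper.

```python
from itertools import product

# Hypothetical toy instance (an assumption for illustration): directed edges
# with LT weights; the incoming weights of every node sum to at most 1.
WEIGHTS = {
    ("s", "a"): 0.5,
    ("s", "b"): 0.4,
    ("a", "b"): 0.3,
    ("b", "c"): 0.6,
    ("a", "c"): 0.2,
}
NODES = ["s", "a", "b", "c"]


def live_edge_graphs(weights, removed=frozenset()):
    """Map every live-edge graph of G minus `removed` to its probability.

    A live-edge graph is stored as a frozenset of edges. Each node picks one
    of its remaining incoming edges with probability equal to the edge
    weight, or no edge with the leftover probability.
    """
    options = []
    for v in NODES:
        incoming = [(e, w) for e, w in weights.items()
                    if e[1] == v and e not in removed]
        none_prob = 1.0 - sum(w for _, w in incoming)
        options.append(incoming + [(None, none_prob)])
    graphs = {}
    for combo in product(*options):
        edges = frozenset(e for e, _ in combo if e is not None)
        prob = 1.0
        for _, p in combo:
            prob *= p
        graphs[edges] = prob
    return graphs


def partition_by_choice(graphs, e):
    """Split graphs by the selection of node v = e[1]: e, another edge, none."""
    v = e[1]
    with_e, with_other, with_none = set(), set(), set()
    for g in graphs:
        chosen = [x for x in g if x[1] == v]
        if not chosen:
            with_none.add(g)
        elif chosen[0] == e:
            with_e.add(g)
        else:
            with_other.add(g)
    return with_e, with_other, with_none
```

With `e = ("a", "b")` and `X` empty, the three sets returned by `partition_by_choice` play the roles of \(\mathcal {G}^e\), \(\mathcal {G}^{\overline{e}}\) and \(\mathcal {G}^{\emptyset }\). Adding e to each graph of the third set should reproduce the first set (Proposition 2), and the keys of `live_edge_graphs(WEIGHTS, {e})` should equal the union of the second and third sets (Proposition 3).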
Proof of Lemma 3
We need to show that \( \sigma _{d,E}(S, X) \ge \sigma _{d,E}(S, X \cup \{e\})\). The idea of the proof is similar to that of Theorem 5 in Khalil et al. (2014). Using Propositions 2 and 3, we have:
Recall that \(e=(u, v)\). For \(g \in \mathcal {G}^{\emptyset }(G \setminus X)\), we have:
For \(g \in \mathcal {G}^{\overline{e}}(G \setminus X)\) we have \(P(v, g, G \setminus X)=P(v, g, G \setminus (X\cup \{e\}))=w(\overline{e})\), which leads to \(\Pr [g|G \setminus X]=\Pr [g|G \setminus (X\cup \{e\})]\). This implies:
Using Proposition 2, for every \(g\in \mathcal {G}^{\emptyset }(G \setminus X)\) there exists a corresponding \(\widetilde{g} \in \mathcal {G}^e(G \setminus X)\) with \(\Pr [\widetilde{g}|G \setminus X]=w(u, v)\prod _{v' \ne v}p(v', \widetilde{g}, G \setminus X)\). Therefore,
Since g is a subgraph of \(\widetilde{g}\), the set of vertices reachable from S in g is a subset of the set of vertices reachable from S in \(\widetilde{g}\). Hence, \(f_d(\widetilde{g}, S) - f_d(g, S) \ge 0\), which completes the proof. \(\square \)
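One way to write out the chain of equalities behind this argument is sketched below. It is a reconstruction under stated assumptions rather than the paper's own displayed equations: it assumes that \(\sigma _{d,E}(S, X)=\sum _{g} \Pr [g|G \setminus X]\, f_d(g, S)\) with the sum taken over live-edge graphs of \(G \setminus X\), and that node v selects no incoming edge with probability one minus the total weight of its remaining incoming edges, so that removing e shifts exactly \(w(u, v)\) of probability mass onto the "no edge" outcome.

```latex
\begin{align*}
\sigma_{d,E}(S,X) - \sigma_{d,E}(S,X\cup\{e\})
 &= \sum_{g\in\mathcal{G}^{e}(G\setminus X)} \Pr[g\mid G\setminus X]\, f_d(g,S) \\
 &\quad + \sum_{g\in\mathcal{G}^{\overline{e}}(G\setminus X)}
      \bigl(\Pr[g\mid G\setminus X]-\Pr[g\mid G\setminus (X\cup\{e\})]\bigr)\, f_d(g,S) \\
 &\quad + \sum_{g\in\mathcal{G}^{\emptyset}(G\setminus X)}
      \bigl(\Pr[g\mid G\setminus X]-\Pr[g\mid G\setminus (X\cup\{e\})]\bigr)\, f_d(g,S) \\
 &= \sum_{g\in\mathcal{G}^{\emptyset}(G\setminus X)}
      w(u,v)\prod_{v'\ne v} p(v',g,G\setminus X)\,
      \bigl(f_d(\widetilde{g},S)-f_d(g,S)\bigr) \;\ge\; 0 .
\end{align*}
```

In this sketch the second sum vanishes because the two probabilities coincide on \(\mathcal {G}^{\overline{e}}(G \setminus X)\), each term of the third sum contributes \(-w(u, v)\prod _{v' \ne v}p(v', g, G \setminus X)\, f_d(g, S)\), and the first sum is re-indexed over \(\mathcal {G}^{\emptyset }(G \setminus X)\) through the correspondence \(g \mapsto \widetilde{g}\) of Proposition 2.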
Proof of Lemma 4
The idea of the proof is similar to that of Theorem 6 in Khalil et al. (2014). For an edge \(f \in E\), let \(t=|\mathcal {G}^{\emptyset }(G \setminus (X \cup \{f\}))|\). By Proposition 4, we can partition \(\mathcal {G}^{\emptyset }(G \setminus X)\) into t sets \(\{\varPhi _i\}_{i=1}^t\) and rewrite (36) as:
Using similar reasoning to that in Eq. (36) in the proof of lemma 1 for \(G \setminus (X \cup \{f \})\), we have:
We compare Eqs. (37) and (38) term by term for each \(g_i \in \mathcal {G}^{\emptyset }(G \setminus X), i=1,\ldots ,t \). There are two cases: (1) \(\varPhi _i =\{g_i\}\), when \(g_i\) has an incoming edge to \(v'\) other than f; in this case the corresponding terms in the two equations are equal; (2) \(\varPhi _i=\{g_i, g_i'\}\), when \(g_i\) has no incoming edge to \(v'\) and \(g_i'\) is obtained from \(g_i\) by selecting f as the incoming edge of \(v'\); in this case we need to prove:
Using Proposition 5 (from Khalil et al. (2014)), we have:
Hence, inequality (39) is true if:
Note that \(\widetilde{g'_i}=(V, E_{g_i} \cup \{f\})\) and that live-edge graphs are constructed so that each node has at most one incoming edge. Every reachability path in \(\widetilde{g_i}\) is also present in \(\widetilde{g'_i}\); hence, if removing edge e from \(\widetilde{g_i}\) makes some nodes unreachable, the same nodes become unreachable when e is removed from \(\widetilde{g'_i}\). Removing edge e from \(\widetilde{g'_i}\) may additionally disconnect nodes whose paths from the source include edge f. Therefore, the reduction in reachable nodes when removing edge e from \(\widetilde{g'_i}\) is at least as large as the reduction when removing e from \(\widetilde{g_i}\), which implies that (41) is true. \(\square \)
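The two properties argued in these proofs can be sanity-checked numerically on a tiny instance. The sketch below assumes that \(\sigma _{d,E}(S, X)\) is the expected number of nodes within d hops of S over linear threshold live-edge graphs of \(G \setminus X\), and that the inequality established in the second proof is the diminishing-returns form \(\sigma (X) - \sigma (X \cup \{e\}) \ge \sigma (X \cup \{f\}) - \sigma (X \cup \{e, f\})\). The toy graph, weights, source set, hop limit and function names are all hypothetical; this is a brute-force check, not one of the paper's algorithms.

```python
from itertools import combinations, product

# Hypothetical toy instance (assumption): LT edge weights, sources, hop limit.
WEIGHTS = {
    ("s", "a"): 0.5,
    ("s", "b"): 0.4,
    ("a", "b"): 0.3,
    ("b", "c"): 0.6,
    ("a", "c"): 0.2,
}
NODES = ["s", "a", "b", "c"]
SOURCES = {"s"}
D = 2  # assumed meaning of d: maximum number of hops from the sources


def reach_within(edges, sources, d):
    """f_d(g, S): number of nodes within d hops of the sources in graph g."""
    seen, frontier = set(sources), set(sources)
    for _ in range(d):
        frontier = {v for (u, v) in edges if u in frontier} - seen
        seen |= frontier
    return len(seen)


def sigma(removed):
    """Expected d-hop reach after deleting `removed`, by full enumeration."""
    options = []
    for v in NODES:
        incoming = [(e, w) for e, w in WEIGHTS.items()
                    if e[1] == v and e not in removed]
        options.append(incoming + [(None, 1.0 - sum(w for _, w in incoming))])
    total = 0.0
    for combo in product(*options):
        prob = 1.0
        for _, p in combo:
            prob *= p
        edges = [e for e, _ in combo if e is not None]
        total += prob * reach_within(edges, SOURCES, D)
    return total


def check(tol=1e-12):
    """Return (monotone, diminishing) over all sets X and edges e, f not in X."""
    all_edges = list(WEIGHTS)
    monotone = diminishing = True
    for k in range(len(all_edges) + 1):
        for X in map(frozenset, combinations(all_edges, k)):
            rest = [e for e in all_edges if e not in X]
            for e in rest:
                gain = sigma(X) - sigma(X | {e})
                monotone &= gain >= -tol
                for f in rest:
                    if f != e:
                        gain_f = sigma(X | {f}) - sigma(X | {f, e})
                        diminishing &= gain >= gain_f - tol
    return monotone, diminishing
```

Calling `check()` enumerates every removal set on the five-edge toy graph. Full enumeration is exponential in the number of nodes, so this only serves as a check on very small examples.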
Cite this article
Pham, C.V., Thai, M.T., Duong, H.V. et al. Maximizing misinformation restriction within time and budget constraints. J Comb Optim 35, 1202–1240 (2018). https://doi.org/10.1007/s10878-018-0252-3
DOI: https://doi.org/10.1007/s10878-018-0252-3