Abstract
Influence maximization is an important research topic in social networks, with applications such as analyzing the spread of rumors, interest, and adoption of innovations, and feed ranking. The goal is to select a limited-size subset of vertices (called a seed-set) in a social graph so that, upon their activation, a maximum number of vertices of the graph become activated through the influence of the vertices on each other. The linear threshold model is one of two classic stochastic propagation models that describe the spread of influence in a network. We present a new approach called MLPR (matrix multiplication, linear programming, randomized rounding), with linear programming at its core, to solve the influence maximization problem in the linear threshold model. Experiments on four real data sets show the efficiency of the MLPR method in solving this problem. The spread of the resulting seed-sets is as large as that obtained with state-of-the-art algorithms; however, unlike most existing algorithms, the runtime of our method is independent of the seed size and does not increase with it.
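To make the propagation model concrete, the following is a minimal sketch of how the spread of a given seed-set can be estimated under the linear threshold model by Monte Carlo simulation. It is not the MLPR method itself. The function names (`simulate_lt`, `estimate_spread`), the `in_edges` dictionary layout, and the default number of runs are illustrative assumptions; the sketch also assumes the usual convention that each vertex draws a uniform random threshold and that its incoming edge weights sum to at most 1.

```python
import random


def simulate_lt(in_edges, seeds, rng):
    """Run one linear threshold cascade and return the set of active vertices.

    in_edges maps every vertex v to a dict {u: weight of edge u -> v}.
    Assumption: the incoming weights of each vertex sum to at most 1.
    """
    # Each vertex draws a threshold uniformly from (0, 1], so a vertex with
    # no active in-neighbours (total weight 0) can never become active.
    thresholds = {v: 1.0 - rng.random() for v in in_edges}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, preds in in_edges.items():
            if v in active:
                continue
            # Total weight of the in-neighbours of v that are already active.
            pressure = sum(w for u, w in preds.items() if u in active)
            if pressure >= thresholds[v]:
                active.add(v)
                changed = True
    return active


def estimate_spread(in_edges, seeds, runs=1000, seed=0):
    """Average number of active vertices over `runs` independent cascades."""
    rng = random.Random(seed)
    total = sum(len(simulate_lt(in_edges, seeds, rng)) for _ in range(runs))
    return total / runs
```

For example, calling `estimate_spread({"a": {}, "b": {"a": 0.5}}, {"a"})` estimates the expected spread of the seed-set `{"a"}` on a hypothetical two-vertex graph. Activation in this model is monotone, so the order in which vertices are scanned within a pass does not change the final active set.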
Notes
The runtime for solving linear programs depends on the number of variables and constraints of the linear program.
In graph theory a simple path is a path in a graph which does not have repeating vertices.
An edge from a node to itself.
References
Ackerman E, Ben-Zwi O, Wolfovitz G (2010) “Combinatorial Model and Bounds for Target Set Selection.” Theoretical Comp Sci 411(44–46): 4017–4022. https://doi.org/10.1016/j.tcs.2010.08.021. https://linkinghub.elsevier.com/retrieve/pii/S0304397510004561.
Chen W, Lakshmanan LVS, Castillo C (2013) “Information and Influence Propagation in Social Networks.” Synthesis Lectures Data Manage 5(4): 1–177. https://doi.org/10.2200/S00527ED1V01Y201308DTM037.
Chen W, Wang C, Wang Y (2010) “Scalable IM for Prevalent Viral Marketing in Large-Scale Social Networks.” In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1029. New York, New York, USA: ACM Press. https://doi.org/10.1145/1835804.1835934. https://dl.acm.org/citation.cfm?doid=1835804.1835934.
Chen W, Yuan Y, Zhang L (2010) “Scalable IM in Social Networks Under LT.” In IEEE International Conference on Data Mining, 88–97. https://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5693962.
Domingos P, Richardson M (2001) “Mining the Network Value of Customers.” In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 57–66. New York, New York, USA: ACM Press. https://doi.org/10.1145/502512.502525. https://portal.acm.org/citation.cfm?doid=502512.502525.
Le Gall F (2014) “Powers of Tensors and Fast Matrix Multiplication.” Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation (ISSAC 2014): 1–28. https://arxiv.org/abs/1401.7714.
GhayourBaghbani F, Asadpour M, Faili H (2019) Integer LP for IM. Iran J Sci Technol Trans Electr Eng 43:627. https://doi.org/10.1007/s40998-019-00178-7
Goyal A, Lu W, Lakshmanan LVS (2009) “CELF++: Optimizing the Greedy Algorithm for IM in Social Networks.” In International World Wide Web Conference. https://www.cs.ubc.ca/~goyal/research/celf++.pdf.
Goyal A, Lu W, Lakshmanan LVS (2011) “SIMPATH: An Efficient Algorithm for IM Under LT.” In 2011 IEEE 11th International Conference on Data Mining, 211–220. IEEE. https://doi.org/10.1109/ICDM.2011.132. https://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6137225.
Güney E (2019) An Efficient LP Based Method for the IM Problem in Social Networks. Information Sci 503:589–605. https://doi.org/10.1016/j.ins.2019.07.043
Huang K, Wang S, Bevilacqua G, Xiao X, Lakshmanan LVS (2017) “Revisiting the Stop-and-Stare Algorithms for IM.” In International Conference on Computational Social Networks, 10:913–924. https://doi.org/10.1007/978-3-030-04648-4_23.
Jung K, Heo W, Chen W (2012) “IRIE: Scalable and Robust IM in Social Networks.” In IEEE International Conference on Data Mining, 1–19. https://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6413832.
Karmarkar N (1984) A new polynomial time algorithm for LP. Combinatorica 4:373–395. https://doi.org/10.1007/BF02579150. https://web.archive.org/web/20131228145520/http://retis.sssup.it/~bini/teaching/optim2010/karmarkar.pdf
Kempe D, Kleinberg J, Tardos E (2003) “Maximizing the Spread of Influence through a Social Network.” In ACM SIGKDD Conference on Knowledge Discovery and Data Mining. https://dl.acm.org/citation.cfm?id=956769.
Kempe D, Kleinberg J, Tardos E (2005) Influential Nodes in a Diffusion Model for Social Networks. Autom Languages Progr. https://doi.org/10.1007/11523468_91
Keskin ME, Güler G (2018) “IM in Social Networks: an Integer Programming Approach.” Turk J Elec Eng Comp Sci 26: 3383–3396. https://doi.org/10.3906/elk-1802-212. https://www.semanticscholar.org/paper/Influence-maximization-in-social-networks%3A-an-Keskin-G%C3%BCler/4730a82b77cf16c9fc8a7d12cb2a5cab7fb87478.
Küçükyavuz H, Simge W (2017) “A Two-Stage Stochastic Programming Approach for IM in Social Networks.” Computational Optim Appl https://doi.org/10.1007/s10589-017-9958-x.
Leiserson C, Arthur E, Smith C, Randall KH (1998) Cilk: Efficient Multithreaded Computing. Ph.D. thesis, Mass Inst Technol, pp 1–179. https://supertech.csail.mit.edu/papers/randall-phdthesis.pdf
Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance N (2007) “Cost-Effective Outbreak Detection in Networks.” In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 420. New York, New York, USA: ACM Press. https://doi.org/10.1145/1281192.1281239. https://portal.acm.org/citation.cfm?doid=1281192.1281239.
Li X, Smith JD, Dinh TN, Thai MT (2019) “TipTop : ( Almost ) Exact Solutions for IM in Billion-Scale Networks.” In IEEE/ACM Trans on Netw (TON), 649–661. https://doi.org/10.1109/TNET.2019.2898413.
Meindl B, Templ M (2012) Analysis of Commercial and Free and Open Source Solvers for Linear Optimization Problems. In: Technical Report 1; Institut F, Statistik U, Eds. Wahrscheinlichkeitstheorie: Wien, Austria. https://www.statistik.tuwien.ac.at/forschung/CS/CS-2012-1complete.pdf
Nguyen HT, Thai MT, Dinh TN (2016) Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-Scale Networks. In: SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, pp 695–710. https://doi.org/10.1145/2882903.2915207
Raghavan P, Thompson CD (1987) Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica 7(4): 365–374. https://doi.org/10.1007/BF02579324
Tang Y (2015) “IM in Near-Linear Time: A Martingale Approach.” In SIGMOD, 1539–1554. https://dl.acm.org/citation.cfm?id=2723734.
Wang C, Chen W, Wang Y (2012) “Scalable IM for IC in Large-Scale Social Networks.” In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 25:545–576. https://doi.org/10.1007/s10618-012-0262-1.
Appendix A: improvement percentage of spread
We observed in Fig. 1 that the spread of MLPR is larger than that of the Monte Carlo Greedy algorithm, while it is similar to that of SIMPATH and Martingale. In this appendix we compare them in more detail. Tables 3, 4, 5 and 6 show the improvement percentage of the MLPR method over the previous methods in terms of spread.
The MC10 and MC100 columns in Tables 3 and 4 show large improvements for MLPR (except for MC100 on the NetHEPT dataset with seed size 5).
The columns SIMPATH 10⁻², SIMPATH 10⁻³ and Martingale contain both positive and negative small percentages. We observe that as the datasets grow (from NetHEPT, the smallest dataset, to Twitter, the largest), the negative percentages decrease and the absolute value of the improvements approaches zero. These results, together with the fact that computing the exact spread of the resulting seed-sets is NP-hard and the computed spreads are therefore approximations, lead us to state that “MLPR along with SIMPATH and Martingale produce similar spreads”. We therefore have to compare them on other performance factors, e.g., execution time (Tables 3, 4, 5 and 6).
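The improvement percentages discussed in this appendix can be illustrated with a short sketch. The text does not state the exact formula, so the definition below (relative gain of the MLPR spread over a baseline spread, in percent) is an assumption, and the function name `improvement_percentage` is hypothetical.

```python
def improvement_percentage(spread_mlpr, spread_baseline):
    """Relative gain of spread_mlpr over spread_baseline, in percent.

    Assumed definition: 100 * (MLPR - baseline) / baseline. A positive
    value means MLPR reached more vertices than the baseline method; a
    negative value means it reached fewer.
    """
    if spread_baseline <= 0:
        raise ValueError("baseline spread must be positive")
    return 100.0 * (spread_mlpr - spread_baseline) / spread_baseline
```

Under this definition, values close to zero in either direction indicate that the two methods produce nearly the same spread. Because the spreads are themselves estimates, small percentages of either sign do not separate the methods.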
Cite this article
Ghayour-Baghbani, F., Asadpour, M. & Faili, H. MLPR: Efficient influence maximization in linear threshold propagation model using linear programming. Soc. Netw. Anal. Min. 11, 3 (2021). https://doi.org/10.1007/s13278-020-00704-0
DOI: https://doi.org/10.1007/s13278-020-00704-0