A numerical evaluation of the accuracy of influence maximization algorithms

Kingi, Hautahi; Wang, Li-An Daniel; Shafer, Tom; Huynh, Minh; Trinh, Mike; Heuser, Aaron; Rochester, George; Paredes, Antonio

doi:10.1007/s13278-020-00680-5

Original Article
Published: 24 August 2020

Volume 10, article number 70, (2020)

Social Network Analysis and Mining

Hautahi Kingi ORCID: orcid.org/0000-0002-5913-0972¹,
Li-An Daniel Wang²,
Tom Shafer³,
Minh Huynh⁴,
Mike Trinh⁵,
Aaron Heuser⁸,
George Rochester⁶ &
Antonio Paredes⁷

Abstract

We develop an algorithm to compute exact solutions to the influence maximization problem using concepts from reverse influence sampling (RIS). We implement the algorithm using GPU resources to evaluate the empirical accuracy of theoretically guaranteed greedy and RIS approximate solutions. We find that the approximations yield solutions that are remarkably close to optimal—usually achieving greater than 99% of the optimal influence spread. This accuracy is consistent across a wide range of network structures.
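The comparison described in the abstract can be illustrated on a toy scale. The sketch below is not the authors' GPU implementation: it assumes the independent cascade model with a uniform edge probability `p`, samples `theta` reverse-reachable (RR) sets, and then compares the greedy maximum-coverage seed set with a brute-force search over all size-`k` seed sets on the same RR sets. The brute-force enumeration stands in for the exact search, and the graph, the parameter values, and all function names are hypothetical. The spread estimate uses the standard RIS relation that the expected spread of a seed set is approximately `n` times the fraction of RR sets it intersects.

```python
import itertools
import random


def build_example(n=12, m=30, seed=0):
    """Hypothetical random directed graph, stored as in-neighbor lists."""
    rng = random.Random(seed)
    in_neighbors = [[] for _ in range(n)]
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and (u, v) not in edges:
            edges.add((u, v))
            in_neighbors[v].append(u)
    return in_neighbors


def random_rr_set(in_neighbors, p, rng):
    """Sample one RR set: nodes that reach a random root via live edges."""
    root = rng.randrange(len(in_neighbors))
    visited = {root}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for src in in_neighbors[node]:
            # Each incoming edge is live independently with probability p.
            if src not in visited and rng.random() < p:
                visited.add(src)
                frontier.append(src)
    return frozenset(visited)


def coverage(seeds, rr_sets):
    """Number of RR sets intersected by at least one seed."""
    seeds = set(seeds)
    return sum(1 for rr in rr_sets if not seeds.isdisjoint(rr))


def greedy_seeds(rr_sets, n, k):
    """Greedy maximum coverage over the RR sets (approximate solution)."""
    seeds = []
    remaining = list(rr_sets)
    for _ in range(k):
        counts = [0] * n
        for rr in remaining:
            for v in rr:
                counts[v] += 1
        # Pick the unchosen node that covers the most uncovered RR sets.
        best = max((v for v in range(n) if v not in seeds),
                   key=counts.__getitem__)
        seeds.append(best)
        remaining = [rr for rr in remaining if best not in rr]
    return seeds


def exact_seeds(rr_sets, n, k):
    """Brute-force maximum coverage over every size-k seed set."""
    return max(itertools.combinations(range(n), k),
               key=lambda s: coverage(s, rr_sets))


# Assumed toy parameters; the article itself uses theta = 100,000.
n, k, p, theta = 12, 3, 0.2, 2000
rng = random.Random(1)
in_neighbors = build_example(n=n)
rr_sets = [random_rr_set(in_neighbors, p, rng) for _ in range(theta)]

greedy = greedy_seeds(rr_sets, n, k)
exact = exact_seeds(rr_sets, n, k)
cov_greedy = coverage(greedy, rr_sets)
cov_exact = coverage(exact, rr_sets)

# RIS estimate of expected spread: n * (fraction of RR sets covered).
spread_greedy = n * cov_greedy / theta
spread_exact = n * cov_exact / theta
print("greedy seeds:", greedy, "estimated spread:", spread_greedy)
print("exact seeds: ", list(exact), "estimated spread:", spread_exact)
print("greedy / exact:", cov_greedy / cov_exact)
```

Because both solutions are evaluated on the same RR sets, the printed ratio isolates the loss from greedy seed selection. Enumeration is only feasible here because the number of candidate seed sets is tiny.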

Notes

“Exact” solutions in this article will technically be \((1-\epsilon )\) approximations, which is consistent with the use of that term in much of the literature (Li et al. 2017, 2019).
We assume \(p_e=p\), \(\forall e\) throughout.
This is Lemma 1 in, for example, Huang et al. (2017).
The value of \(\theta \) determines both the runtime of the algorithm and \(\epsilon \). However, the relationship between \(\theta \) and \(\epsilon \) is a function of the optimal solution, as shown in Theorem 1. The literature has focused on deriving increasingly tight values of \(\theta \) to reduce the runtime, using techniques such as limiting the total number of edges examined during the generation process to a pre-defined threshold (Borgs et al. 2014), applying Chernoff bounds (Tang et al. 2014), and adopting martingale methods (Tang et al. 2015), among others (Nguyen et al. 2016; Huang et al. 2017). We focus on the two computational steps common to all RIS methods and set \(\theta =100{,}000\). Because this choice affects the approximate and exact solutions equally, the proportional difference between the solutions is approximately independent of \(\theta \) so long as it ensures \(\epsilon \ll e\), which it trivially does.
Examples of various versions of this lemma in the literature are Lemma 3 in Tang et al. (2014), Lemma 3 in Tang et al. (2015), and Lemma 5 in Nguyen et al. (2016).
The \(\beta =1\) version is not identical to the Erdős–Rényi model because it requires each node to have at least \(K/2\) connections, whereas the Erdős–Rényi model places no restriction on the number of edges at a given node.
The Python code to generate all results is available at https://github.com/hautahi/IM-Evaluation.
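The note on the \(\beta =1\) Watts–Strogatz model can be checked with a short pure-Python sketch. It assumes the usual rewiring rule, in which each node keeps its end of \(K/2\) lattice edges and only the far end is rewired. The functions `watts_strogatz` and `erdos_renyi` are hypothetical helpers written for this illustration and are not taken from the authors' repository; `k` in the code plays the role of \(K\).

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Adjacency sets for a Watts-Strogatz graph (k even, k < n)."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    # Ring lattice: node i "owns" edges to its k/2 clockwise neighbors.
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    # Rewire the far end of each owned edge with probability beta.
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < beta:
                choices = [v for v in range(n)
                           if v != i and v not in adj[i]]
                if not choices:
                    continue  # node i is already linked to everyone
                new = rng.choice(choices)
                adj[i].discard(old)
                adj[old].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj


def erdos_renyi(n, p, seed=None):
    """Adjacency sets for G(n, p): each pair is linked with probability p."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


# Assumed illustration sizes; the edge probability matches the mean degree k.
n, k = 200, 4
ws = watts_strogatz(n, k, beta=1.0, seed=0)
er = erdos_renyi(n, k / (n - 1), seed=0)

ws_min_degree = min(len(a) for a in ws)
er_min_degree = min(len(a) for a in er)
print("Watts-Strogatz (beta=1) minimum degree:", ws_min_degree)
print("Erdos-Renyi minimum degree:", er_min_degree)
```

Under this rewiring rule the fully rewired graph keeps a minimum degree of at least `k // 2`, while the Erdős–Rényi sample carries no such guarantee and may contain isolated nodes.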

References

Akbarpour M, Malladi S, Saberi A (2018) Diffusion, seeding, and the value of network information. In: Proceedings of the 2018 ACM conference on economics and computation. ACM, pp 641–641
Albert R, Jeong H, Barabási A-L (2000) Error and attack tolerance of complex networks. Nature 406(6794):378
Bader DA, Madduri K (2008) SNAP, small-world network analysis and partitioning: an open-source parallel graph framework for the exploration of large-scale networks. In: 2008 IEEE international symposium on parallel and distributed processing. IEEE, pp 1–12
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Barnat J, Bauch P, Brim L, Ceska M (2011) Computing strongly connected components in parallel on CUDA. IEEE Int Parallel Distrib Process Sympos 2011:544–555
Basaras P, Katsaros D (2019) Identifying influential spreaders in complex networks with probabilistic links. In: Social networks and surveillance for society. Springer, Cham, pp 57–84
Bollobás B, Riordan O (2004) Robustness and vulnerability of scale-free random graphs. Internet Math 1(1):1–35
Borgs C, Brautbar M, Chayes J, Lucier B (2014) Maximizing social influence in nearly optimal time. In: Proceedings of the twenty-fifth annual ACM-SIAM symposium on discrete algorithms. SIAM, pp 946–957
Chen W, Wang Y, Yang S (2009) Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 199–208
Chen W, Wang C, Wang Y (2010) Scalable influence maximization for prevalent viral marketing in large-scale social networks. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1029–1038
Chen D-B, Xiao R, Zeng A (2014) Predicting the evolution of spreading on complex networks. Sci Rep 4:6108
Cohen R, Erez K, Ben-Avraham D, Havlin S (2000) Resilience of the internet to random breakdowns. Phys Rev Lett 85(21):4626
Domingos P, Richardson M (2001) Mining the network value of customers. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 57–66
Domingos P, Richardson M (2002) Mining knowledge-sharing sites for viral marketing. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 61–70
Emami N, Mozafari N, Hamzeh A (2018) Continuous state online influence maximization in social network. Soc Netw Anal Min 8(1):32
Erdős P, Rényi A (1960) On the evolution of random graphs. Publ Math Inst Hung Acad Sci 5(1):17–60
Galhotra S, Arora A, Roy S (2016) Holistic influence maximization: combining scalability and efficiency with opinion-aware models. In: Proceedings of the 2016 international conference on management of data. ACM, pp 743–758
Goyal A, Lu W, Lakshmanan LV (2011) CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: Proceedings of the 20th international conference companion on world wide web. ACM, pp 47–48
Harish P, Narayanan P (2007) Accelerating large graph algorithms on the GPU using CUDA. In: International conference on high-performance computing. Springer, pp 197–208
He X, Kempe D (2016) Robust influence maximization. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 885–894
Huang K, Wang S, Bevilacqua G, Xiao X, Lakshmanan L (2017) Revisiting the stop-and-stare algorithms for influence maximization. Proc VLDB Endow 10:913–924
Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 137–146
Kim J, Kim SK, Yu H (2013) Scalable and parallelizable processing of influence maximization for large-scale social networks? In: IEEE 29th international conference on data engineering (ICDE). IEEE, pp 266–277
LaSalle D, Karypis G (2013) Multi-threaded graph partitioning. In: IEEE 27th international symposium on parallel and distributed processing. IEEE, pp 225–236
Leskovec J, Krause A, Guestrin C, Faloutsos C, VanBriesen J, Glance N (2007) Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 420–429
Li X, Smith JD, Dinh TN, Thai MT (2017) Why approximate when you can get the exact? Optimal targeted viral marketing at scale. In: IEEE INFOCOM 2017-IEEE conference on computer communications. IEEE, pp 1–9
Li X, Smith JD, Dinh TN, Thai MT (2019) TipTop: (almost) exact solutions for influence maximization in billion-scale networks. IEEE/ACM Trans Netw 27(2):649–661
Liu X, Li M, Li S, Peng S, Liao X, Lu X (2013) IMGPU: GPU-accelerated influence maximization in large-scale social networks. IEEE Trans Parallel Distrib Syst 25(1):136–145
Marro J, Dickman R (2005) Nonequilibrium phase transitions in lattice models. Cambridge University Press, Aléa-Saclay
Molloy M, Reed B (1995) A critical point for random graphs with a given degree sequence. Random Struct Algorithms 6(2–3):161–180
Moore C, Newman ME (2000) Epidemics and percolation in small-world networks. Phys Rev E 61(5):5678
More J, Lingam C (2019) A gradient-based methodology for optimizing time for influence diffusion in social networks. Soc Netw Anal Min 9(1):5
Morone F, Makse HA (2015) Influence maximization in complex networks through optimal percolation. Nature 524(7563):65
Newman ME (2001) Clustering and preferential attachment in growing networks. Phys Rev E 64(2):025102
Nguyen HT, Thai MT, Dinh TN (2016) Stop-and-stare: optimal sampling algorithms for viral marketing in billion-scale networks. In: Proceedings of the 2016 international conference on management of data. ACM, pp 695–710
Pastor-Satorras R, Vespignani A (2001) Epidemic dynamics and endemic states in complex networks. Phys Rev E 63(6):066117
Piraveenan M, Harré M, Kasthurirathna D (2016) Optimising influence in social networks using bounded rationality models. Soc Netw Anal Min 6:54
Srivastava A, Chelmis C, Prasanna V (2015) The unified model of social influence and its application in influence maximization. Soc Netw Anal Min 5:66
Tang Y, Xiao X, Shi Y (2014) Influence maximization: near-optimal time complexity meets practical efficiency. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data. ACM, pp 75–86
Tang Y, Shi Y, Xiao X (2015) Influence maximization in near-linear time: a martingale approach. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data. ACM, pp 1539–1554
Tang J, Tang X, Yuan J (2018) An efficient and effective hop-based approach for influence maximization in social networks. Soc Netw Anal Min 8:10
Tsugawa S, Ohsaki H (2018) Robustness of influence maximization against non-adversarial perturbations. In: IEEE/ACM international conference on advances in social networks analysis and mining. Springer, Cham, pp 193–210
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393(6684):440

Author information

Authors and Affiliations

Facebook Inc, New York, NY, USA
Hautahi Kingi
Sam Houston State University, Huntsville, TX, USA
Li-An Daniel Wang
Elder Research, Inc, Raleigh, NC, USA
Tom Shafer
IMPAQ International, Washington DC, USA
Minh Huynh
IMPAQ International, Boston, MA, USA
Mike Trinh
Northeastern University, Boston, MA, USA
George Rochester
U.S. Food and Drug Administration, Silver Spring, MD, USA
Antonio Paredes
IMPAQ International, Columbia, MD, USA
Aaron Heuser

Corresponding author

Correspondence to Hautahi Kingi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kingi, H., Wang, LA.D., Shafer, T. et al. A numerical evaluation of the accuracy of influence maximization algorithms. Soc. Netw. Anal. Min. 10, 70 (2020). https://doi.org/10.1007/s13278-020-00680-5

Received: 26 March 2020
Revised: 22 July 2020
Accepted: 24 July 2020
Published: 24 August 2020
DOI: https://doi.org/10.1007/s13278-020-00680-5
