Efficient diversified influence maximization with adaptive policies
Introduction
With the exponential growth in the number of social network users, recent years have witnessed a boom in information spread over social media. Consequently, the influence maximization (IM) problem [1], [2] has attracted considerable attention. The IM problem aims to select a set of seed nodes and exploit the “word-of-mouth” effect [3], [4] to spread information and activate other nodes in the social network. Once the seeds are convinced to adopt a product (or an idea, a service, etc.), the nodes they activate are regarded as adopting it as well. The goal is to choose the seeds so that the expected number of activated nodes in the social network is maximized. The IM problem has many applications, including viral marketing [5], [6], network monitoring [7], [8], and rumor control [9], [10], [11].
An active line of IM research studies the problem with additional information. The topic-aware IM problem considers the topics of the information to be spread: the probability that a node adopts the information depends on the node's interest in the topic [12], [13], [14]. The time-aware IM problem models the propagation rate of information over social networks in order to activate more nodes before they are influenced by information from competitors [15], [16]. The location-aware IM problem takes the geographical locations of nodes into account when maximizing information propagation [17], [18].
Existing studies on influence maximization mainly focus on maximizing the number of activated nodes. The diversity of the activated nodes, although of great importance in many practical applications, has been mostly overlooked. For instance, users in a social network naturally form different communities. In marketing campaigns, a diverse target audience spread across communities can bring many benefits, such as reducing the risk of the campaign [19]. Diversity also benefits recommendation systems, where the diversity of recommendations is increasingly recognized as an important aspect of recommendation quality [20], [21], [22]. As the proverb goes, “Don’t put all your eggs in one basket”: spreading influence among diverse groups is an intrinsic aspect of IM research. However, to the best of our knowledge, [19] and [23] are the only previous works that explore the diversity of the activated nodes in influence maximization. [19] optimizes a weighted sum of the influence spread and diversity, while [23] models the diversity of the influence spread using three utilities commonly used in economics. The objective functions in both [19] and [23] are difficult to optimize in a scalable manner. This limits the applicability of these methods, because diversity matters most in large networks.
To address this issue, in this paper we propose a practical diversified influence maximization (DIM) problem. The DIM problem aims to select nodes such that both the number of activated nodes and the diversity of the activated nodes can be maximized. We follow the framework of [19] such that the objective function is modeled as a weighted sum of the influence spread and diversity. To tackle the NP-hardness of the DIM problem, we employ the reverse influence sampling technique [24], [25] and propose the DIMM algorithm to approximately solve the DIM problem in an efficient manner. The DIMM algorithm can return a -approximate solution with at least probability.
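Reverse influence sampling, the technique cited above as [24], [25], estimates influence by sampling reverse-reachable (RR) sets and then choosing seeds that cover as many of those sets as possible. The following is a minimal sketch of that idea for the influence-spread term only. It is not the DIMM algorithm: the names `random_rr_set` and `greedy_cover`, the `in_edges` layout, and the toy star graph are assumptions made for illustration, and the diversity term and the RI-sketch structure are left out because this excerpt does not define them.

```python
import random


def random_rr_set(in_edges, nodes, rng):
    """Sample one reverse-reachable (RR) set under the IC model.

    `in_edges` maps a node to a list of (in_neighbor, probability) pairs.
    """
    root = rng.choice(nodes)
    rr = {root}
    stack = [root]
    while stack:
        v = stack.pop()
        for u, p in in_edges.get(v, []):
            # Keep the reversed edge (u -> v) with its propagation probability.
            if u not in rr and rng.random() < p:
                rr.add(u)
                stack.append(u)
    return rr


def greedy_cover(rr_sets, k):
    """Greedily pick up to k nodes that cover the most RR sets.

    Returns the chosen nodes and the fraction of RR sets they cover.
    """
    if not rr_sets:
        return [], 0.0
    covered = [False] * len(rr_sets)
    seeds = []
    for _ in range(k):
        gain = {}
        for i, rr in enumerate(rr_sets):
            if not covered[i]:
                for u in rr:
                    gain[u] = gain.get(u, 0) + 1
        if not gain:
            break  # every RR set is already covered
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(gain), key=gain.get)
        seeds.append(best)
        for i, rr in enumerate(rr_sets):
            if best in rr:
                covered[i] = True
    return seeds, sum(covered) / len(rr_sets)


# Hypothetical toy graph: hub "h" influences three leaves with probability 1.
toy_in_edges = {"x": [("h", 1.0)], "y": [("h", 1.0)], "z": [("h", 1.0)]}
toy_nodes = ["h", "x", "y", "z"]
rng = random.Random(0)
samples = [random_rr_set(toy_in_edges, toy_nodes, rng) for _ in range(200)]
chosen, fraction = greedy_cover(samples, k=1)
# The covered fraction times the number of nodes estimates the expected spread.
print(chosen, fraction * len(toy_nodes))
```

In this sketch the covered fraction of RR sets, scaled by the number of nodes, serves as the spread estimate; greedy selection over RR sets is a maximum-coverage heuristic, which is what makes sampling-based approaches scale better than repeated forward simulation.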
In real applications, the influence spread is highly stochastic and unforeseen events might occur [11], [26], [27], [28]. The above DIM setting assumes that all seed nodes are selected at the very beginning and keeps nothing in reserve for handling unforeseen events. A more reasonable policy is to invest the budget adaptively, based on the influence spread observed as time goes by. Therefore, we further propose the adaptive diversified influence maximization (ADIM) problem. In each time round, the available observations provide evidence for estimating the future reduction in the influence spread of the current seeds. Accordingly, we can decide whether to select new seeds and which seeds to select. With an adaptive policy, part of the budget can be reserved for the case in which the influence spread dies out quickly. By carefully modifying the DIMM algorithm for the adaptive setting, we propose and implement -greedy adaptive policies that return approximate solutions with reasonable error bounds.
Finally, we evaluate our algorithms against existing algorithms on four real-world datasets, two of which are large-scale datasets with more than one million edges. The experimental results validate the effectiveness and efficiency of the proposed algorithms.
Our contributions are as follows.
1. We propose the practical diversified influence maximization (DIM) problem and theoretically analyze the hardness, monotonicity, and submodularity of the DIM problem.
2. We design an approximation algorithm, DIMM, to solve the DIM problem, built on a new data structure, the reverse influence sketch (RI-sketch). We show that the DIMM algorithm can achieve an approximation ratio of at least and near-linear time complexity.
3. Considering the requirements of an adaptive environment, we further propose the adaptive diversified influence maximization (ADIM) problem and design an -Greedy Policy to approximately solve it, which ensures a -dependent approximation ratio.
4. By extending the RI-sketch data structure, we design an efficient implementation of the -Greedy Policy, with a provable error bound on the approximation ratio.
5. We conduct extensive experiments on four real-world datasets. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithms.
The rest of this paper is organized as follows. We briefly review related work in Section 2. We present the DIM problem and its solution in Sections 3 and 4. We then present the adaptive DIM (ADIM) problem and its solution in Sections 5 and 6. The experimental results and discussions are presented in Section 7. Finally, we conclude the paper and present some directions for future work in Section 8. All proofs are given in the appendix.
Related works
The Influence Maximization (IM) problem was first proposed in [3], [5]. Following the probabilistic framework formulated in [5], Kempe et al. [1] modeled it as a discrete optimization problem. They proved that the problem is NP-hard and proposed a greedy framework that solves it with a (1 − 1/e − ε) approximation guarantee. Subsequent studies mainly focus on reducing the running time of the greedy algorithm. It has been shown that the branch-and-bound approach can provide higher empirical efficiency while
Diversified influence maximization
Formally, a social network is modeled as a directed graph G = (V, E), where V is the set of nodes and E the set of edges; n = |V| and m = |E| denote the number of nodes and the number of edges, respectively. To facilitate the presentation, we first introduce the classic independent cascade (IC) model for information propagation [1].
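In the IC model of [1], each newly activated node gets a single chance to activate each of its inactive out-neighbors, succeeding independently with the propagation probability of the connecting edge. The sketch below simulates that process and averages over repeated runs to estimate the expected spread. The adjacency layout, the function names `simulate_ic` and `estimate_spread`, and the toy graph are illustrative assumptions, not the paper's implementation.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent cascade starting from `seeds`.

    `graph` maps each node to a list of (out_neighbor, probability) pairs.
    Returns the set of nodes that are active when the cascade stops.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly active node tries each out-edge exactly once.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


# Hypothetical toy graph: a -> b -> d always fire, a -> c fires half the time.
toy_graph = {
    "a": [("b", 1.0), ("c", 0.5)],
    "b": [("d", 1.0)],
    "c": [],
    "d": [],
}
print(estimate_spread(toy_graph, ["a"]))
```

Forward simulation of this kind is simple, but it must be repeated many times per candidate seed set, which is the cost that sampling-based methods aim to avoid.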
Solution for DIM problem
If the whole social network consists of a single community, the DIM problem degenerates to the traditional IM problem. The DIM problem is therefore at least as hard as the IM problem, both in solving the problem [1] and in computing the objective function [2].
Theorem 1 The DIM problem is NP-hard. Moreover, computing and for any set are both #P-hard.
Nevertheless, the objective function possesses nice properties that allow approximation algorithms to be designed.
Lemma 1 Function is monotone and submodular [19].
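A set function f is monotone if f(S) ≤ f(T) whenever S ⊆ T, and submodular if f(S ∪ {v}) − f(S) ≥ f(T ∪ {v}) − f(T) for all S ⊆ T and v ∉ T. As a small illustration of these two properties, the sketch below checks them by brute force on a toy weighted-sum objective. Everything in it is an assumption made for the example: the deterministic `REACH` sets stand in for influence spread, the diversity term is taken to be a sum of square roots of per-community counts, and `gamma` is a hypothetical trade-off weight. The actual objective of the paper is not given in this excerpt.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: the nodes each candidate seed would activate.
REACH = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 5, 6}}
# Hypothetical community label of each node that can be activated.
COMMUNITY = {1: "x", 2: "x", 3: "x", 4: "y", 5: "z", 6: "z"}


def objective(seeds, gamma=0.5):
    """Weighted sum of a coverage term and an assumed concave diversity term."""
    activated = set().union(*(REACH[s] for s in seeds))
    counts = {}
    for v in activated:
        counts[COMMUNITY[v]] = counts.get(COMMUNITY[v], 0) + 1
    diversity = sum(sqrt(c) for c in counts.values())
    return (1 - gamma) * len(activated) + gamma * diversity


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone_submodular(f, ground, tol=1e-9):
    """Exhaustively check monotonicity and diminishing returns on `ground`."""
    all_sets = list(subsets(ground))
    for small in all_sets:
        for big in all_sets:
            if not small <= big:
                continue
            if f(big) < f(small) - tol:
                return False  # violates monotonicity
            for v in set(ground) - big:
                gain_small = f(small | {v}) - f(small)
                gain_big = f(big | {v}) - f(big)
                if gain_small < gain_big - tol:
                    return False  # violates diminishing returns
    return True


print(is_monotone_submodular(objective, REACH))
```

The exhaustive check is exponential in the number of candidates, so it only serves as a sanity test on tiny instances; the value of the two properties is that they let a greedy selection carry an approximation guarantee without such enumeration.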
Adaptive-DIM problem
The above DIM problem adopts the same one-shot formulation as the IM problem, i.e., the budget is exhausted at the beginning, when all seed nodes are selected and activated, and nothing is done during the subsequent influence spread process. In fact, the influence spread is highly stochastic: although the probability is low, the spread may die out quickly. A more flexible and effective strategy is therefore to select seed nodes adaptively over multiple rounds, based on the influence spread observed in previous rounds. In this
Solution for ADIM problem
We first examine adaptive monotonicity and adaptive submodularity, which form the theoretical basis of the proposed policies. For convenience, we define as the expected value of the objective function of the S-pair set chosen by policy under the realization . According to the analysis in [48], the function is not adaptive submodular, so the greedy approach cannot be applied directly. In the following, we will propose a modified greedy strategy that
Experiment
Conclusion
Most existing work on influence maximization focuses on maximizing the number of activated nodes while ignoring their diversity. In this paper, we propose the diversified influence maximization (DIM) problem and the corresponding DIMM algorithm to approximately solve it. With the carefully designed RI-sketch data structure, the DIMM algorithm can achieve a -approximate solution with at least probability and time complexity near-linear in the network size.
CRediT authorship contribution statement
Can Wang: Conceptualization, Formal analysis, Resources, Writing - original draft, Writing - review & editing, Supervision, Project administration. Qihao Shi: Methodology, Software, Validation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing. Weizhao Xian: Conceptualization, Methodology, Formal analysis, Writing - original draft, Writing - review & editing. Yan Feng: Resources, Supervision, Funding acquisition. Chun Chen: Resources, Supervision, Funding
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is funded by the National Key R&D Program of China (Grant No: 2018AAA0101505) and the State Grid Corporation of China Scientific and Technology Project, China: Fundamental Theory of Human-in-the-loop Hybrid-Augmented Intelligence for Power Grid Dispatch and Control.
References (53)
- et al., Location driven influence maximization: Online spread via offline deployment, Knowl.-Based Syst. (2019)
- et al., Big social network influence maximization via recursively estimating influence spread, Knowl.-Based Syst. (2016)
- et al., Bring order into the samples: A novel scalable method for influence maximization, IEEE Trans. Knowl. Data Eng. (2017)
- et al., CoFIM: A community-based framework for influence maximization on large-scale networks, Knowl.-Based Syst. (2017)
- et al., Community-based influence maximization in social networks under a competitive linear threshold model, Knowl.-Based Syst. (2017)
- et al., Predicting information diffusion probabilities in social networks: A Bayesian networks based approach, Knowl.-Based Syst. (2017)
- et al., Post and repost: A holistic view of budgeted influence maximization, Neurocomputing (2019)
- et al., Maximizing the spread of influence through a social network
- et al., Efficient influence maximization in social networks
- et al., Mining the network value of customers
- Talk of the network: A complex systems look at the underlying process of word-of-mouth, Mark. Lett.
- Mining knowledge-sharing sites for viral marketing
- Scalable influence maximization for prevalent viral marketing in large-scale social networks
- Cost-effective outbreak detection in networks
- Inferring networks of diffusion and influence
- Influence blocking maximization in social networks under the competitive linear threshold model
- Adaptive influence blocking: Minimizing the negative spread by observation-based policies
- Online topic-aware influence maximization queries
- Topic-aware social influence propagation models
- Online topic-aware influence maximization, Proc. VLDB Endowment
- Efficient location-aware influence maximization
- Diversified social influence maximization