Location driven influence maximization: Online spread via offline deployment
Introduction
With more and more people using social networking services, recent years have witnessed a boom in information spreading through social networks. Reposting influential or popular content has become a social norm [1]. As one of the hottest topics of 2016, the US presidential election generated large information cascades on social media, and opinion leaders (celebrities or politicians) are believed to have significantly affected users' decisions to support particular candidates [2]. Similar phenomena are frequently observed in online marketing, in users' adoption of new products [3]. Consequently, how to generate a large online influence spread in social networks has attracted significant attention from both academia and industry [4], [5], [6], [7]. In general, the spread of influence in a networked system, whether of positive ideas and information or of negative malware and epidemics, is based on the interactions of users [4], [5], [8], [9], [10]. These works therefore all follow the user-driven model, i.e., they target a small set of users, called seed nodes, to trigger a large cascade of online influence spread. By exploiting users' online social relations, these methods have achieved moderate success in deciding online promotion strategies.
However, users' online social interactions can be profoundly affected by their physical context. The advent of social games such as Pokémon Go [11] has revealed the gradual convergence of the cyber world and the real world [12]. Analogously, a well-located offline promotion event is usually accompanied by a large online influence cascade in social networks, which in turn improves the promotion. For instance, when selecting the location for a singer's show, we need to consider both the size of the audience and their influence in social networks. Audience members are likely to post tweets sharing photos and their emotions, which may trigger a long chain of reposts in social networks and boost the singer's popularity. Therefore, it is crucial to consider how online influence spread interweaves with people's offline mobility patterns when hosting offline promotion events. Consequently, an interesting location-driven influence maximization problem follows: given a limited budget to host promotion events, e.g., launching offline marketing campaigns or arranging outdoor campaign speeches for an election, how can we determine the optimal locations and durations for these events so as to maximize the corresponding online influence spread? As far as we know, this location-driven problem has not been studied before.
In this paper, we reformulate influence maximization from a new location-driven perspective by explicitly considering social users' offline-online interactions. The general Location-Driven Influence Maximization (LDIM) problem aims to allocate a given budget to a set of locations, both spatially and temporally, for holding events, so as to maximize the online influence spread. For convenience, we give the formal definition of the LDIM problem here.
Given a social network and a set of locations, we define the budget allocation vector (or simply allocation for short) as the vector whose entries are the budgets allocated to the individual locations. Each allocation has an expected influence spread, which serves as the objective.
Definition 1 (LDIM). Given an influence propagation model and a budget, the LDIM problem aims to find an allocation within the budget that maximizes the expected influence spread.
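In symbols, Definition 1 might be written as follows. The notation is assumed here for illustration only and is not taken from the original text: $L = \{l_1, \dots, l_m\}$ for the locations, $\mathbf{b}$ for the allocation, $B$ for the budget and $\sigma(\cdot)$ for the expected influence spread. The integer-valued entries are also an assumption, suggested by the later statement that the greedy algorithm works over an integer lattice.

```latex
\mathbf{b} = (b_1, \dots, b_m) \in \mathbb{Z}_{\ge 0}^{m},
\qquad
\mathbf{b}^{*} = \operatorname*{arg\,max}_{\mathbf{b} \,:\, \sum_{i=1}^{m} b_i \le B} \sigma(\mathbf{b})
```

Here $b_i$ is the budget allocated to location $l_i$, for example the number of days an event is held there.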
A precondition in Definition 1 is a specific propagation model, which decides how online influence is triggered by offline events and how the objective function is evaluated. To this end, Location-Based Social Networks (LBSNs) have made both people's online social relations and their physical locations abundantly available, which bridges the gap between offline promotion events and online influence spread. Utilizing LBSN data, we introduce a Location-Driven Propagation (LDP) model to describe the location-driven influence spread process and explain how the expected influence spread is evaluated in Section 3.
To better illustrate our idea, we present a toy example in Fig. 1, which contains three locations: X, Y and Z. Every day, X has a large number of offline visitors with limited online influence. Meanwhile, users A and B are both influential online; A visits Y every day, and B visits both Y and Z every day. Suppose we are hosting a promotion event with a budget of 2, that is, enough to hold the event for two days in one location or for one day in each of two locations. Obviously, if we want to maximize the offline audience, the event should be held at location X. However, if we want to trigger a large online influence spread, we should hold the event at location Y for two days, since the influence of both A and B will be triggered. If B did not visit location Y, the optimal budget allocation would change to holding the event at Y and Z for one day each. We will explain why these two intuitive allocations are optimal in their respective cases when we introduce the Location-Driven Propagation model in Section 3. This example shows the necessity of considering both users' physical location information and their social relations in the location-driven influence maximization problem.
In the remainder of this paper, we discuss related works in Section 2. In Section 3, we introduce the LDP model and prove that the LDIM problem is NP-hard and that computing the expected influence spread is #P-hard for any given allocation. In Section 4, we present a greedy algorithm over the integer lattice to approximately solve the LDIM problem. To overcome its expensive time complexity and further improve computational efficiency, we develop two algorithms with successively reduced time complexity, each with a provable approximation ratio, which are presented in Section 5. In Section 6, experiments on real datasets show the effectiveness of all three proposed algorithms. We leave conclusions and future works to Section 7. The frequently used notations are listed in Table 1.
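The toy example can be checked numerically with a small sketch. It does not implement the LDP model, which is only defined in Section 3; it assumes a simplified stand-in in which each user independently visits a location with a fixed per-day probability and contributes their online influence once they attend at least one event day. All numbers, the aggregated `crowd` user and the function names are hypothetical.

```python
from itertools import product

LOCATIONS = ("X", "Y", "Z")

# Hypothetical per-day visit probabilities (not taken from the paper).
# "crowd" aggregates the many low-influence visitors of location X.
VISIT = {
    "crowd": {"X": 0.9},
    "A": {"Y": 0.5},
    "B": {"Y": 0.5, "Z": 0.5},
}

# Hypothetical online influence triggered once a user attends an event.
INFLUENCE = {"crowd": 10.0, "A": 100.0, "B": 100.0}


def expected_spread(allocation, visit=VISIT, influence=INFLUENCE):
    """Expected influence when `allocation` maps each location to event days."""
    total = 0.0
    for user, probs in visit.items():
        # Probability that the user misses every event day at every location.
        miss = 1.0
        for loc, days in allocation.items():
            miss *= (1.0 - probs.get(loc, 0.0)) ** days
        total += influence[user] * (1.0 - miss)
    return total


def best_allocation(budget, visit=VISIT):
    """Exhaustively search all allocations that spend exactly `budget` days."""
    best, best_value = None, float("-inf")
    for days in product(range(budget + 1), repeat=len(LOCATIONS)):
        if sum(days) != budget:
            continue
        allocation = dict(zip(LOCATIONS, days))
        value = expected_spread(allocation, visit)
        if value > best_value:
            best, best_value = allocation, value
    return best, best_value


if __name__ == "__main__":
    print(best_allocation(2))
    # Variant of the example in which B no longer visits Y.
    without_b_at_y = {"crowd": {"X": 0.9}, "A": {"Y": 0.5}, "B": {"Z": 0.5}}
    print(best_allocation(2, without_b_at_y))
```

Under these assumed numbers, the search picks two days at Y in the first case and one day each at Y and Z in the second, matching the intuition described above. Exhaustive search is only feasible because the example is tiny.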
It is worthwhile to summarize our contributions as follows.
- 1. We correlate offline events with online influence maximization and formulate the practical Location-Driven Influence Maximization (LDIM) problem.
- 2. We propose the Location-Driven Propagation (LDP) model to describe the online influence spread process triggered by offline events, and prove the NP-hardness of the LDIM problem and the #P-hardness of computing the influence spread under the LDP model.
- 3. To solve the LDIM problem, we propose a Greedy Framework with a provable approximation ratio. To reduce its expensive time complexity, we further propose two algorithms with successively reduced time complexity that still ensure provable approximation ratios.
- 4. We conduct experiments on real-world datasets, and the experimental results demonstrate the effectiveness and scalability of the proposed methods.
Related works
The Influence Maximization (IM) problem was first proposed and formulated within a probabilistic framework in [13], [14]. Kempe et al. [4] then formulated it as a discrete optimization problem. After proving its NP-hardness, they proposed a greedy framework that solves it with a (1 − 1/e − ε) approximation guarantee. Though the greedy framework is simple and effective, its very high time complexity means it does not scale to large networks. Motivated by this shortcoming, numerous
Location-Driven Propagation model
In this section, we introduce the Location-Driven Propagation (LDP) model which describes the online influence spread process triggered by the offline events. Before explaining the LDP model in detail, we first list three key probabilities here.
- 1. The probability that a user visits a location at each time step. See the solid lines between users and locations in the offline world in Fig. 1.
- 2. The probability that a user becomes a seed node if he visits a location which is
Greedy algorithm
Given the hardness of the LDIM problem, we first present a simple greedy framework in Algorithm 1 that solves the problem with a provable approximation ratio.
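A minimal sketch of a greedy procedure over an integer lattice is given below. It is not the paper's Algorithm 1: the `spread` argument is a hypothetical oracle standing in for whatever estimator evaluates the expected influence spread, and `toy_spread` with its weights and probabilities is an invented example with diminishing returns. Each step adds one unit of budget to the location whose increment yields the largest marginal gain.

```python
from typing import Callable, Dict, Sequence

Allocation = Dict[str, int]


def greedy_allocation(
    locations: Sequence[str],
    budget: int,
    spread: Callable[[Allocation], float],
) -> Allocation:
    """Spend `budget` units one at a time on the best marginal gain."""
    allocation: Allocation = {loc: 0 for loc in locations}
    if not locations:
        return allocation
    for _ in range(budget):
        base = spread(allocation)
        best_loc, best_gain = None, float("-inf")
        for loc in locations:
            # Try adding one unit (a unit allocation) at this location.
            trial = dict(allocation)
            trial[loc] += 1
            gain = spread(trial) - base
            if gain > best_gain:
                best_loc, best_gain = loc, gain
        allocation[best_loc] += 1
    return allocation


# Hypothetical oracle: each location has a weight and a per-unit hit
# probability, so repeated units at one location give diminishing returns.
TOY_WEIGHT = {"X": 10.0, "Y": 200.0, "Z": 80.0}
TOY_HIT = {"X": 0.9, "Y": 0.5, "Z": 0.5}


def toy_spread(allocation: Allocation) -> float:
    return sum(
        TOY_WEIGHT[loc] * (1.0 - (1.0 - TOY_HIT[loc]) ** units)
        for loc, units in allocation.items()
    )


if __name__ == "__main__":
    print(greedy_allocation(["X", "Y", "Z"], 3, toy_spread))
```

Every greedy step calls the oracle once per location, so the cost of the oracle dominates the running time when it is an expensive estimator.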
In Algorithm 1, the unit allocation for a location is the vector with 1 at that location's index and 0 elsewhere, and the starting point is the all-zero vector. Algorithm 1 greedily selects the unit allocation with the largest marginal gain until the budget is exhausted. Note that the objective is not a simple set function but a function over an integer lattice. Based on [56], we have the
RIS-based Greedy algorithm
Due to the inefficiency of the MCG algorithm, we propose a more scalable RIS-based Greedy (RG) algorithm, as shown in Algorithm 3. It mainly derives from the Reverse Influence Sampling (RIS) based methods [6], [21], [22], [23]. Before detailing Algorithm 3, we first define the RIL set.
Definition 2 (Reverse Influence Location Set). Given a graph, consider an instance sampled by the process in Section 3. An RIL set for a node is the set of all copy nodes that can reach that node in the sampled instance, i.e., there is a
Experimental settings
Two real-world LBSNs, BrightKite and Gowalla, are used in the experiments, as listed in Table 2. Both consist of a friendship graph and a list of check-ins.
Conclusions & future works
In this paper, we study a new location-driven influence maximization (LDIM) problem by explicitly considering social network users' online social relations and their physical locations. LDIM offers a new location-driven perspective on influence maximization. Utilizing LBSNs, we introduce an LDP model to describe the location-driven influence spread process. After proving the hardness of the problem, we show the effectiveness of the greedy algorithm. To overcome the expensive time
Acknowledgment
This work is supported by National Natural Science Foundation of China (Grant No: U1866602) and Zhejiang Provincial Key Research and Development Plan (Grant no. 2017C01012). It is also partially supported by ByteDance.
References (59)
- Modeling and analyzing the dynamic spreading of epidemic malware by a network eigenvalue method. Appl. Math. Model. (2018)
- Modeling cyber rumor spreading over mobile social networks: A compartment approach. Appl. Math. Comput. (2019)
- Big social network influence maximization via recursively estimating influence spread. Knowl.-Based Syst. (2016)
- CoFIM: A community-based framework for influence maximization on large-scale networks. Knowl.-Based Syst. (2017)
- Community-based influence maximization in social networks under a competitive linear threshold model. Knowl.-Based Syst. (2017)
- Predicting information diffusion probabilities in social networks: A Bayesian networks based approach. Knowl.-Based Syst. (2017)
- Location-aware influence maximization over dynamic social streams. ACM Trans. Inf. Syst. (2018)
- Rain: social role-aware information diffusion
- The political blogosphere and the 2004 U.S. election: Divided they blog
- Opinion leadership and social contagion in new product diffusion. Marketing Science (2011)
- Maximizing the spread of influence through a social network
- Efficient influence maximization in social networks
- Influence maximization in near-linear time: A martingale approach
- Viral marketing meets social advertising: Ad allocation with minimum regret. Proc. VLDB Endow.
- Web malware spread modelling and optimal control strategies. Sci. Rep.
- Pokémon Go and augmented virtual reality games: A cautionary commentary for parents and pediatricians. Current Opinion Pediatr.
- Pervasive games: Bringing computer entertainment back to the real world. Comput. Entertain.
- Mining the network value of customers
- Mining knowledge-sharing sites for viral marketing
- Cost-effective outbreak detection in networks
- Celf: Optimizing the greedy algorithm for influence maximization in social networks
- Scalable influence maximization for prevalent viral marketing in large-scale social networks
- A data-based approach to social influence maximization. Proc. VLDB Endow.
- Scalable and parallelizable processing of influence maximization for large-scale social networks
- Maximizing social influence in nearly optimal time
- Influence maximization: near-optimal time complexity meets practical efficiency
- Stop-and-stare: Optimal sampling algorithms for viral marketing in billion-scale networks
- Bring order into the samples: A novel scalable method for influence maximization. IEEE Trans. Knowl. Data Eng.
- Community-based greedy algorithm for mining top-k influential nodes in mobile social networks