Elsevier

Knowledge-Based Systems

Volume 166, 15 February 2019, Pages 30-41
Location driven influence maximization: Online spread via offline deployment

https://doi.org/10.1016/j.knosys.2018.12.003

Abstract

Existing works on influence maximization (IM) aim at finding influential online users as seed nodes. Starting from these seed nodes, a large online influence spread can be triggered. However, such a user-driven perspective confines the IM problem to a purely online environment. Due to the increasing interactions between the cyber world and the physical world, offline events in the real world have a growing impact on online information spread. Most IM methods are unaware of these cyber–physical interactions, and thus their effectiveness is limited when offline events are taken into account. To address this issue, in this paper we consider influence maximization in an online–offline interactive setting and propose the location-driven influence maximization (LDIM) problem. The LDIM problem aims to find the optimal offline deployment of locations and durations to hold events, so as to maximize the online influence spread. We propose a location-driven propagation (LDP) model to describe the online influence spread process triggered by offline events. Under the LDP model, we prove that the LDIM problem is NP-hard and that computing the objective function is #P-hard. We therefore develop a greedy algorithm over the integer lattice and prove that it achieves a $1 - 1/e$ approximation ratio. To overcome its expensive time complexity, we further develop two algorithms with successively reduced time complexity and provable approximation ratios of $1 - 1/e - \varepsilon$ and $1 - 1/e - \varepsilon - \varepsilon'$ respectively. Experimental results on real datasets show the effectiveness of the proposed algorithms and a significant improvement in scalability.

Introduction

With more and more people using social networking services, recent years have witnessed a boom in information spread through social networks. Reposting influential or popular content has already become a social norm [1]. As one of the hottest topics of 2016, the US presidential election generated large information cascades in social media, and it is believed that opinion leaders (celebrities or politicians) significantly affected users’ decisions to support the candidates [2]. Similar phenomena are also frequently observed in online marketing, in users’ adoption of new products [3]. Consequently, how to generate a large online influence spread in social networks has attracted significant attention from both academic and industrial communities [4], [5], [6], [7]. In general, the spread of influence in a networked system, whether of positive ideas and information or of negative malware and epidemics, is based on the interactions of users [4], [5], [8], [9], [10]. Therefore, these works all follow the user-driven model, i.e., they target a small set of users, called seed nodes, to trigger a large cascade of online influence spread. By exploiting users’ online social relations, these methods have achieved moderate success in deciding online promotion strategies.

However, users’ online social interactions can be profoundly affected by their physical context. The advent of social games such as Pokémon Go [11] has revealed the gradual convergence of the cyber world and the real world [12]. Analogously, a well-located offline promotion event is usually accompanied by a large online influence cascade in social networks, which in turn improves the promotion. For instance, when selecting the location of a singer's show, we need to consider both the size of the audience and its influence in social networks. Audience members are likely to post tweets sharing photos and their emotions. This may trigger a long chain of reposts in social networks and boost the popularity of the singer. Therefore, it is crucial to consider the interweaving of online influence spread and people’s offline mobility patterns when hosting offline promotion events. Consequently, an interesting location-driven influence maximization problem follows: given a limited budget to host promotion events, e.g., launching an offline marketing campaign or arranging outdoor speeches for an election, how can we determine the optimal locations and durations for these events so as to maximize the corresponding online influence spread? As far as we know, this location-driven problem has not been studied before.

In this paper, we reformulate influence maximization from a new location-driven perspective by explicitly considering social network users’ offline–online interactions. The general Location-Driven Influence Maximization (LDIM) problem aims at allocating a given budget to a set of locations, both spatially and temporally, for holding events, so as to maximize the online influence spread. For convenience, we give the formal definition of the LDIM problem here.

Given a social network $G=(V,E)$ and a set of locations $D=\{d_1,\dots,d_{n_d}\}$, we define the budget allocation vector (or simply allocation for short) as $x=(x_1,\dots,x_{n_d})$, where $x_i$ is the budget allocated to location $d_i$. The expected influence spread for an allocation $x$ is defined as $f(x)$.

Definition 1 (LDIM)

Given an influence propagation model and a budget $k$, the LDIM problem aims to find an allocation $x^*$ that maximizes $f(x)$: $x^* = \arg\max_x f(x)$ s.t. $|x| = \sum_{i=1}^{n_d} x_i \le k$.
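To make the definition concrete, the following minimal sketch enumerates every non-negative integer allocation within the budget and picks the best one by exhaustive search. It is only an illustration under stated assumptions: `toy_spread` is a hypothetical stand-in for $f(x)$ with made-up weights, not the LDP model, and the function names are not from the paper. Exhaustive search is feasible only for tiny instances, which is consistent with the hardness results stated for LDIM.

```python
from itertools import product
from typing import Callable, Iterator, Tuple

Allocation = Tuple[int, ...]


def feasible_allocations(n_d: int, k: int) -> Iterator[Allocation]:
    """Yield every non-negative integer vector x of length n_d with sum(x) <= k."""
    for x in product(range(k + 1), repeat=n_d):
        if sum(x) <= k:
            yield x


def brute_force_ldim(f: Callable[[Allocation], float], n_d: int, k: int) -> Allocation:
    """Return an allocation maximizing f subject to |x| <= k (exhaustive search)."""
    return max(feasible_allocations(n_d, k), key=f)


def toy_spread(x: Allocation) -> float:
    """Hypothetical objective (an assumption, NOT the LDP model).

    Each location has a made-up weight, and every extra unit of budget at
    the same location adds half as much as the previous unit.
    """
    weights = (1.0, 3.0, 2.0)  # illustrative values only
    return sum(w * (1.0 - 0.5 ** xi) for w, xi in zip(weights, x))


if __name__ == "__main__":
    best = brute_force_ldim(toy_spread, n_d=3, k=2)
    print(best, toy_spread(best))
```

With three locations and a budget of 2 there are only ten feasible allocations, so the search is instant; the count grows combinatorially with $n_d$ and $k$.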

A precondition of Definition 1 is a specific propagation model, which decides how online influence is triggered by offline events and how the objective function $f(x)$ is evaluated. To this end, Location-Based Social Networks (LBSNs) have made both people’s online social relations and their physical locations widely available, which bridges the gap between offline promotion events and online influence spread. Utilizing LBSN data, we introduce a Location-Driven Propagation (LDP) model to describe the location-driven influence spread process and explain the evaluation of $f(x)$ in Section 3.

To better illustrate our idea, we present a toy example in Fig. 1, which contains three locations: X, Y and Z. Every day, X has a large number of offline visitors with limited online influence. Meanwhile, users A and B are both influential online; A visits Y every day, and B visits both Y and Z every day. Suppose we host a promotion event with a budget of 2, that is, enough to hold the event for two days at one location or for one day at each of two locations. Obviously, if we want to maximize the offline audience, the event should be held at location X. However, if we want to trigger a large online influence spread, we should hold the event at location Y for two days, since the influence of both A and B will be triggered. If B did not visit location Y, the optimal budget allocation would change to holding the event at Y and Z for one day each. We explain why these two intuitive allocations are optimal in their respective cases in Section 3, where the Location-Driven Propagation model is introduced. This example shows the necessity of considering both users’ physical location information and their social relations in the location-driven influence maximization problem.

The rest of this paper is organized as follows. We discuss related works in Section 2. In Section 3, we introduce the LDP model and prove that the LDIM problem is NP-hard and that computing $f(x)$ is #P-hard for any $x$. In Section 4, we present a greedy algorithm over the integer lattice to approximately solve the LDIM problem. To overcome its expensive time complexity and further improve computational efficiency, we develop two algorithms with successively reduced time complexity and approximation ratios of $1 - 1/e - \varepsilon$ and $1 - 1/e - \varepsilon - \varepsilon'$ respectively, which are presented in Section 5. In Section 6, experiments on real datasets show the effectiveness of all three proposed algorithms. Conclusions and future works are discussed in Section 7. Frequently used notations are listed in Table 1.

We summarize our contributions as follows.

  1. We correlate offline events with online influence maximization and formulate the practical Location-Driven Influence Maximization (LDIM) problem.

  2. We propose the Location-Driven Propagation (LDP) model to describe the online influence spread process triggered by offline events, and prove the NP-hardness of the LDIM problem and the #P-hardness of computing the influence spread under the LDP model.

  3. To solve the LDIM problem, we propose a greedy framework with a provable approximation ratio. To reduce its expensive time complexity, we further propose two algorithms with successively reduced time complexity and provable approximation ratios.

  4. We conduct experiments on real-world datasets, and the experimental results demonstrate the effectiveness and scalability of the proposed methods.

Related works

The Influence Maximization (IM) problem was first proposed and formulated within a probabilistic framework in [13], [14]. Kempe et al. [4] then formulated it as a discrete optimization problem. After proving its NP-hardness, they proposed a greedy framework that solves it with a $1 - 1/e$ approximation guarantee. Though the greedy framework is simple and effective, it does not scale to large networks because of its very high time complexity of $O(knmr)$. Motivated by this shortcoming, numerous

Location-Driven Propagation model

In this section, we introduce the Location-Driven Propagation (LDP) model which describes the online influence spread process triggered by the offline events. Before explaining the LDP model in detail, we first list three key probabilities here.

  1. $m(d_i,u)$: It is the probability that user $u$ visits location $d_i$ at each time step. See the solid lines between user and location in the offline world in Fig. 1.

  2. $\gamma(u)$: It is the probability that user $u$ becomes a seed node if he visits a location which is

Greedy algorithm

Given the hardness of the LDIM problem, we first present a simple greedy framework in Algorithm 1 that solves the problem with a $1 - 1/e$ approximation ratio.

In Algorithm 1, we denote by $e_z$ the unit allocation, a vector with 1 at the $z$-th index and 0 elsewhere, and by $\mathbf{0}$ the all-zero vector. Algorithm 1 greedily selects the $e_z$ with the largest marginal gain until the budget is exhausted. Note that $f(x)$ is not a simple set function but a function over an integer lattice. Based on [56], we have the
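The greedy loop described above can be sketched as follows. This is a minimal illustration of greedy selection over an integer lattice, not the authors' Algorithm 1 listing: the objective is passed in as an exact oracle, whereas computing $f(x)$ under the LDP model is stated to be #P-hard and would have to be estimated. The name `lattice_greedy` and the stand-in objective `toy_spread` (with made-up weights) are assumptions introduced here for illustration.

```python
from typing import Callable, Tuple

Allocation = Tuple[int, ...]


def lattice_greedy(f: Callable[[Allocation], float], n_d: int, k: int) -> Allocation:
    """Start from the all-zero vector and, k times, add the unit allocation
    e_z whose marginal gain f(x + e_z) - f(x) is largest."""
    x = [0] * n_d
    current = f(tuple(x))
    for _ in range(k):
        best_z, best_value = 0, float("-inf")
        for z in range(n_d):
            x[z] += 1                  # tentatively add e_z
            value = f(tuple(x))
            x[z] -= 1                  # undo the tentative step
            if value - current > best_value - current:
                best_z, best_value = z, value
        x[best_z] += 1                 # commit the best unit allocation
        current = best_value
    return tuple(x)


def toy_spread(x: Allocation) -> float:
    """Hypothetical objective with diminishing returns per location
    (an assumption for illustration, NOT the LDP model)."""
    weights = (1.0, 3.0, 2.0)  # illustrative values only
    return sum(w * (1.0 - 0.5 ** xi) for w, xi in zip(weights, x))


if __name__ == "__main__":
    print(lattice_greedy(toy_spread, n_d=3, k=2))
```

Each of the $k$ rounds evaluates the objective once per location, so the sketch makes $k \cdot n_d$ oracle calls; the cost of each call is what dominates when $f$ has to be estimated by simulation.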

Ris-based Greedy algorithm

Due to the inefficiency of the MCG algorithm, we propose a more scalable Ris-based Greedy (RG) algorithm, as shown in Algorithm 3. It mainly derives from Reverse Influence Sampling (RIS) based methods [6], [21], [22], [23]. Before going into the details of Algorithm 3, we first define the RIL set.

Definition 2 (Reverse Influence Location Set)

Given a graph G, let gd be an instance sampled by the process in Section 3. An RIL set for a node v, denoted as I, is a set containing all the copy nodes dij that can reach v in gd, i.e., I={dij| there is a

Experimental settings

Two real-world LBSNs, BrightKite and Gowalla, are used in the experiments, as listed in Table 2. Both consist of a friendship graph and a list of check-ins.

Conclusions & future works

In this paper, we study a new location-driven influence maximization (LDIM) problem by explicitly considering social network users’ online social relations and their physical locations. LDIM offers a new location-driven perspective on influence maximization. Utilizing LBSNs, we introduce an LDP model for describing the location-driven influence spread process. After proving the hardness of the problem, we show the effectiveness of the greedy algorithm. To overcome the expensive time

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant No. U1866602) and the Zhejiang Provincial Key Research and Development Plan (Grant No. 2017C01012). It is also partially supported by ByteDance.

References (59)

  • Kempe D. et al. Maximizing the spread of influence through a social network.
  • Chen W. et al. Efficient influence maximization in social networks.
  • Tang Y. et al. Influence maximization in near-linear time: A martingale approach.
  • Aslay C. et al. Viral marketing meets social advertising: Ad allocation with minimum regret. Proc. VLDB Endow. (2014)
  • Liu W. et al. Web malware spread modelling and optimal control strategies. Sci. Rep. (2017)
  • Serino M. et al. Pokémon Go and augmented virtual reality games: A cautionary commentary for parents and pediatricians. Current Opinion Pediatr. (2016)
  • Magerkurth C. et al. Pervasive games: Bringing computer entertainment back to the real world. Comput. Entertain. (2005)
  • Domingos P. et al. Mining the network value of customers.
  • Richardson M. et al. Mining knowledge-sharing sites for viral marketing.
  • Leskovec J. et al. Cost-effective outbreak detection in networks.
  • Goyal A. et al. CELF++: Optimizing the greedy algorithm for influence maximization in social networks.
  • Chen W. et al. Scalable influence maximization for prevalent viral marketing in large-scale social networks.
  • Goyal A. et al. A data-based approach to social influence maximization. Proc. VLDB Endow. (2011)
  • Kim J. et al. Scalable and parallelizable processing of influence maximization for large-scale social networks.
  • Borgs C. et al. Maximizing social influence in nearly optimal time.
  • Tang Y. et al. Influence maximization: near-optimal time complexity meets practical efficiency.
  • Nguyen H.T. et al. Stop-and-stare: Optimal sampling algorithms for viral marketing in billion-scale networks.
  • Wang X. et al. Bring order into the samples: A novel scalable method for influence maximization. IEEE Trans. Knowl. Data Eng. (2017)
  • Wang Y. et al. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks.
