Elsevier

Knowledge-Based Systems

Volume 161, 1 December 2018, Pages 111-123
Influence maximization in social networks under Deterministic Linear Threshold Model

https://doi.org/10.1016/j.knosys.2018.07.040

Abstract

We define the new Targeted and Budgeted Influence Maximization under Deterministic Linear Threshold Model problem and develop the novel and scalable TArgeted and BUdgeted Potential Greedy (TABU-PG) algorithm, which supports alternative methods for solving this problem. It is an iterative and greedy algorithm that relies on investing in potential future gains when choosing seed nodes. We suggest new techniques that mimic the real world for generating influence weights, thresholds, profits, and costs. Extensive computational experiments on four real networks (Epinions, Academia, Pokec, and Inploid) show that our proposed heuristics perform significantly better than the benchmarks. We equip TABU-PG with novel scalability methods that reduce runtime by limiting the seed node candidate pool or by selecting more nodes at once, at the cost of some spread performance.

Introduction

The social process of influence takes place intensely and frequently among people. As a result, people’s decisions and behaviours are influenced by others. Such influence can be observed in marketing, consumer behaviour, politics, persuasion, peer pressure, conformity, and leadership.

Furthermore, influence may occur at a conscious or an unconscious level. At a conscious level, people choose whether to be influenced as a result of a rational decision-making process. There are two types of benefits in imitating the decisions of others: the direct-benefit effect and the informational effect. The direct-benefit effect takes place when one’s payoff from one’s own action is directly affected by other people’s actions. This phenomenon is also called the network effect. It can be illustrated by the historical adoption of the fax machine, whose adoption rate climbed slowly to a tipping point and then peaked quickly [1]. The informational effect occurs when collective information is more powerful than one’s private information. In a setting where one has limited information on which action to prefer, the decision is more likely to be made by mimicking others’ decisions. On the other hand, one does not always control whether to be influenced, as influence can also happen at an unconscious level. Social conformity, mirroring, and other psychological effects are examples of such situations [2].

The earliest studies of influence propagation in social networks took place in the middle of the 20th century. In his famous book, Rogers [3] brings together a number of studies on how innovations in agricultural methods and tools spread in rural communities. His work paved the way for the development of notions such as the strength of weak ties, the tipping point, phases of adoption, and categories of technology adopters.

Given the importance of social influence and the opportunities it offers, marketers try to take advantage of it to increase their market recognition and the adoption of their products. For companies, a well-planned, calculated, and targeted viral marketing effort in the form of an “influencer marketing campaign” can trigger a cascading positive word-of-mouth effect [4]. Ideally, subsidizing a few influential people to promote a certain brand will create a cascade in the network. The problem, therefore, is to select a set of influentials such that influence spread is maximized while the cost of subsidizing them is kept within a given budget.

As a motivating example, consider an online baby products retailer that wants to advertise over a social network. The main target market for this company is people who have or expect to have babies. Such people might be characterized by age group or online behaviour. The target market is further segmented into subgroups, for example by income level, each carrying a different customer lifetime value (e.g., expected profit) for the retailer. The retailer sets a budget to promote its products to its target market via influencers, who differ in their self-perceived value and their impact in the social network and who charge different prices for their services. Hence, the budget should be spent efficiently when selecting the influencers. The retailer aims to maximize its profit while staying within the allowed budget.

Information cascades in social networks can be modelled with various diffusion models, including Markov random fields [5], voter models [6], the Independent Cascade Model (ICM), and the Linear Threshold Model (LTM). ICM and LTM are the most common diffusion models in the literature.

LTM assumes that diffusion time steps are discrete. At any time, a node is either active (i.e., influenced) or inactive. Once a node is active, it cannot become inactive later (i.e., the model is progressive). Each node contributes to the activation of its neighbours. In LTM, each link is assigned a weight w_vu representing the influence of node v on the target node u. Each node u has an assigned activation threshold θ_u. The process starts with a set of initially active nodes, which serve as the seed nodes. At any time step t, node u becomes active if the sum of the influence weights on the links from its active neighbours exceeds the randomly determined threshold θ_u. The process runs until a time step at which no more nodes are activated.
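
The diffusion process described above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation used in this study: the function name `simulate_ltm`, the dictionary-based inputs, and the toy network are assumptions made for the example. Thresholds are passed in as fixed values, and the comparison is taken as non-strict (`>=`), as in common formulations of LTM; replace it with `>` if "exceeds" is meant strictly.

```python
from collections import defaultdict


def simulate_ltm(weights, thresholds, seeds):
    """Run a linear threshold diffusion until no new node activates.

    weights:    dict mapping (v, u) -> influence weight w_vu of v on u
    thresholds: dict mapping every node u -> threshold theta_u
    seeds:      iterable of initially active nodes
    Returns the set of active nodes when the process stops.
    """
    # Index incoming links once so each step only needs lookups.
    incoming = defaultdict(list)
    for (v, u), w in weights.items():
        incoming[u].append((v, w))

    active = set(seeds)
    while True:
        newly_active = set()
        for u, theta in thresholds.items():
            if u in active:
                continue  # progressive model: active nodes stay active
            influence = sum(w for v, w in incoming[u] if v in active)
            # Assumption: non-strict comparison; use ">" for a strict reading.
            if influence >= theta:
                newly_active.add(u)
        if not newly_active:
            return active  # no activation in this step, so the process ends
        active |= newly_active


# Hypothetical toy network with four nodes.
weights = {("a", "b"): 0.6, ("a", "c"): 0.3, ("b", "c"): 0.3, ("c", "d"): 0.4}
thresholds = {"a": 1.0, "b": 0.5, "c": 0.5, "d": 0.5}
print(sorted(simulate_ltm(weights, thresholds, {"a"})))  # ['a', 'b', 'c']
```

In the toy run, node b activates in the first step, node c only in the second step once both a and b are active, and node d never reaches its threshold.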

In ICM [7], on the other hand, a node v activated at time t gets a single attempt, at time t+1, to activate each of its inactive neighbours u. The attempt succeeds with probability p_vu. ICM is therefore an inherently stochastic process.
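
For contrast with LTM, a single run of ICM can be sketched as follows. The function name `simulate_icm`, the dictionary of link probabilities, and the optional `rng` argument are assumptions introduced for this illustration; since the process is stochastic, estimating the expected spread would require averaging over many such runs.

```python
import random


def simulate_icm(probabilities, seeds, rng=None):
    """One stochastic run of the independent cascade model.

    probabilities: dict mapping (v, u) -> activation probability p_vu
    seeds:         iterable of initially active nodes
    rng:           optional random.Random instance for reproducible runs
    Returns the set of active nodes when the cascade dies out.
    """
    if rng is None:
        rng = random.Random()

    outgoing = {}
    for (v, u), p in probabilities.items():
        outgoing.setdefault(v, []).append((u, p))

    active = set(seeds)
    frontier = list(active)  # nodes activated in the previous time step
    while frontier:
        next_frontier = []
        for v in frontier:
            for u, p in outgoing.get(v, []):
                # v gets exactly one attempt on each still-inactive neighbour.
                if u not in active and rng.random() < p:
                    active.add(u)
                    next_frontier.append(u)
        frontier = next_frontier  # only newly activated nodes try next
    return active


# Hypothetical toy network; results vary from run to run unless rng is seeded.
probabilities = {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.1}
print(sorted(simulate_icm(probabilities, {"a"}, random.Random(7))))
```

Because each node leaves the frontier after one step, a failed attempt is never retried, which is the defining difference from a threshold-based model where influence accumulates.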

Our contribution

In this study, we make the following contributions:

  • We define the new Targeted and Budgeted Influence Maximization in Social Networks under Deterministic Linear Threshold Model problem. This problem differs from existing studies in the literature by (i) considering a deterministic diffusion model, and (ii) extending the original Influence Maximization Problem [7] both to a targeted version, where nodes might carry heterogeneous profit values, and to a budgeted version, where nodes might carry heterogeneous costs for becoming seed nodes.

  • We develop a new algorithm named Targeted and Budgeted Potential Greedy (TABU-PG) for the problem we defined. The algorithm employs a set of alternative methods for node selection and potential gain calculation. Some of the optional methods included in TABU-PG are taken from the literature to serve as benchmarks and the others are novel methods introduced in this work.

  • We propose novel methods to enable TABU-PG heuristics to run on very large networks in a significantly shorter amount of time by trading between spread performance (i.e., total profit) and runtime.

  • We propose new methods for generating influence weights for links, and threshold, profit, and cost values for nodes. In our opinion, in many cases our methods reflect real-world dynamics more accurately than the methods most widely employed in the literature.

  • We provide empirical evaluations of the TABU-PG heuristics and of benchmarks such as the closeness, betweenness, PageRank, strength, authority, hub, eigenvector, and random heuristics. With extensive computational experiments, we show how all heuristics perform with eight different datasets on four different real-life networks.

The paper is structured as follows. In Section 2, we review how the Influence Maximization Problem emerged and developed in the literature, along with a comparison with our study. In Section 3, we provide a formal definition of the problem, present our TABU-PG algorithm, and describe the dataset generation methods we employ. In Section 4, experimental results and discussion are given. The conclusion and final remarks are given in Section 5.

Section snippets

Related work

Domingos and Richardson [5] popularized the concept of the network value of customers. By approaching the market as a set of connected entities rather than independent ones, they shifted the paradigm from considering only the intrinsic value of each entity to also considering the extra value that might emerge from influences between entities. Their study introduced the fundamental problem of Influence Maximization, that is, how to choose seed nodes so that particular influence spread

Formal problem definition

Let G=(V,E) be a directed network where V is the set of nodes with |V|=n, and E is the set of links with |E|=m. Each node v ∈ V is associated with a threshold value θ_v, an activation cost c_v for being a seed node, and a profit value p_v. Each directed link (u,v) has an influence weight i_uv representing the amount of influence node u has on node v. The budget is denoted by B.
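
To make the notation concrete, the sketch below stores a problem instance and evaluates a candidate seed set. It is an illustration under stated assumptions, not the formulation of this paper: the names `Instance`, `spread`, and `evaluate` are hypothetical, the activation test is assumed to be non-strict (`>=`), and the objective is assumed to be the sum of the profit values p_v over all nodes that end up active, seeds included.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    influence: dict  # (u, v) -> i_uv, influence of node u on node v
    threshold: dict  # v -> theta_v
    cost: dict       # v -> c_v, cost of selecting v as a seed node
    profit: dict     # v -> p_v, profit value of node v
    budget: float    # B


def spread(inst, seeds):
    """Deterministic threshold diffusion from the given seed set."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        # Total influence each node currently receives from active nodes.
        received = {}
        for (u, v), w in inst.influence.items():
            if u in active:
                received[v] = received.get(v, 0.0) + w
        for v, total in received.items():
            # Assumption: a node activates when influence reaches theta_v.
            if v not in active and total >= inst.threshold[v]:
                active.add(v)
                changed = True
    return active


def evaluate(inst, seeds):
    """Return (total cost, total profit), or None if the budget is exceeded."""
    total_cost = sum(inst.cost[s] for s in seeds)
    if total_cost > inst.budget:
        return None  # infeasible seed set
    # Assumption: profit is collected from every active node, seeds included.
    total_profit = sum(inst.profit[v] for v in spread(inst, seeds))
    return total_cost, total_profit


# Hypothetical three-node instance with budget B = 3.
inst = Instance(
    influence={("a", "b"): 0.7, ("b", "c"): 0.5, ("a", "c"): 0.2},
    threshold={"a": 0.9, "b": 0.5, "c": 0.6},
    cost={"a": 3, "b": 2, "c": 1},
    profit={"a": 1, "b": 4, "c": 5},
    budget=3,
)
print(evaluate(inst, {"a"}))       # (3, 10)
print(evaluate(inst, {"a", "b"}))  # None, since the cost 5 exceeds B
```

In this toy instance, spending the whole budget on node a activates all three nodes, whereas seeding only node c costs less but collects only its own profit.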

At any time step, a node can be in only one of two states, inactive or active, represented by σ_v ∈ {0, 1}. f(v)

Experimental results

We present the performance of our algorithm with experimental results. An experiment is performed for each generated dataset. Experiments 1 and 2 are for the Epinions network, Experiments 3 and 4 for Academia, Experiments 5 and 6 for Inploid, and Experiments 7 and 8 for Pokec.

For each experiment, strength,

Conclusion

In this paper, we defined the new Targeted and Budgeted Influence Maximization Problem under Deterministic LTM. We extended the original Influence Maximization Problem by allowing different nodes to carry different cost and return values under a Deterministic LTM. This makes it possible to model different real-world Influence Maximization problems depending on how the return values are generated; assigning values based on estimated profits would make it a profit maximization problem whereas

Acknowledgement

F. Gursoy is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under 2210-A Program.

References (43)

  • F. Wang et al.

    On positive influence dominating sets in social networks

    Theor. Comput. Sci.

    (2011)
  • M. Gladwell

    The Tipping Point: How Little Things Can Make a Big Difference

    (2006)
  • D. Kahneman

    Thinking, Fast and Slow

    (2011)
  • E.M. Rogers

    Diffusion of Innovations

    (1962)
  • R. Ferguson

    Word of mouth and viral marketing: taking the temperature of the hottest trends in marketing

    J. Consumer Marketing

    (2008)
  • P. Domingos et al.

    Mining the network value of customers

    Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2001)
  • E. Even-Dar et al.

    A Note on Maximizing the Spread of Influence in Social Networks

  • D. Kempe et al.

    Maximizing the spread of influence through a social network

    Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2003)
  • A. Goyal et al.

    Simpath: an efficient algorithm for influence maximization under the linear threshold model

    Data Mining (ICDM), 2011 IEEE 11th International Conference on

    (2011)
  • W. Chen et al.

    Scalable influence maximization in social networks under the linear threshold model

    Data Mining (ICDM), 2010 IEEE 10th International Conference on

    (2010)
  • J. Leskovec et al.

    Cost-effective outbreak detection in networks

    Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2007)
  • A. Goyal et al.

    Celf++: optimizing the greedy algorithm for influence maximization in social networks

    Proceedings of the 20th International Conference Companion on World Wide Web

    (2011)
  • C. Zhou et al.

    Ublf: An upper bound based approach to discover influential nodes in social networks

    Data Mining (ICDM), 2013 IEEE 13th International Conference on

    (2013)
  • W. Chen et al.

    Efficient influence maximization in social networks

    Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2009)
  • W. Chen et al.

    Scalable influence maximization for prevalent viral marketing in large-scale social networks

    Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2010)
  • F. Li et al.

    Labeled influence maximization in social networks for target marketing

    Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on

    (2011)
  • W. Lu et al.

    Profit maximization over social networks

    Data Mining (ICDM), 2012 IEEE 12th International Conference on

    (2012)
  • Y. Li et al.

    Real-time targeted influence maximization for online advertisements

    Proceedings of the VLDB Endowment

    (2015)
  • J. Lee et al.

    A query approach for influence maximization on specific users in social networks

    IEEE Trans. Knowl. Data Eng.

    (2015)
  • C. Song et al.

    Targeted influence maximization in social networks

    Proceedings of the 25th ACM International Conference on Information and Knowledge Management

    (2016)
  • H. Nguyen et al.

    On budgeted influence maximization in social networks

    IEEE J. Sel. Areas Commun.

    (2013)