Influence maximization in social networks under Deterministic Linear Threshold Model
Introduction
The social process of influence takes place intensely and frequently among people. As a result, people’s decisions and behaviours are shaped by others. Such influence can be observed in marketing, consumer behaviour, politics, persuasion, peer pressure, conformity, and leadership.
Furthermore, influence may occur at a conscious or an unconscious level. At a conscious level, people choose whether to be influenced as a result of a rational decision-making process. There are two types of benefits in imitating the decisions of others: the direct-benefit effect and the informational effect. The direct-benefit effect takes place when one’s payoff from her own action is directly affected by other people’s actions. This phenomenon is also called the network effect. It can be illustrated by the historical adoption rates of the fax machine, which peaked quickly after slowly reaching a tipping point [1]. The informational effect occurs when collective information is more powerful than one’s private information. In a setting where one has limited information on which action to prefer, the decision is more likely to be made by mimicking others’ decisions. On the other hand, one does not always have control over whether to be influenced, as influence can also happen at an unconscious level. Social conformity, mirroring, and other psychological effects are examples of such situations [2].
The earliest studies of influence propagation in social networks took place in the middle of the 20th century. In his famous book, Rogers [3] brings together a number of studies that examine how innovations in agricultural methods and tools spread in rural communities. He paved the way for the development of notions such as strength of weak ties, tipping point, phases of adoption, and categories of technology adopters.
Given the importance of and opportunities in social influence, marketers try to take advantage of it in order to increase their market recognition and the adoption of their products. For companies, well-planned, calculated, and targeted viral marketing in the form of an “influencer marketing campaign” can trigger a cascading positive word-of-mouth effect [4]. Ideally, subsidizing a few influential people to promote a certain brand will create a cascade in the network. Therefore, the problem is to select a set of influentials such that the influence spread is maximized while the cost of subsidizing the influentials is kept within a given budget.
As a motivating example, consider an online baby products retailer that wants to advertise over a social network. The main target market for this company is people who have or expect to have babies. Such people might be characterized by age group or online behaviour. The target market is further segmented into subgroups, for example with respect to income levels, which carry different customer lifetime values (e.g., expected profit) for the retailer. The retailer sets a budget to promote its products to its target market via influencers who possess varying degrees of self-perceived value and impact in the social network and charge different prices for their service. Hence, the budget should be spent efficiently when selecting the influencers. The retailer aims to maximize its profit while staying within the allowed budget.
Information cascades in social networks can be modelled by employing various diffusion models, including Markov random fields [5], voter models [6], the Independent Cascade Model (ICM), and the Linear Threshold Model (LTM). The most common diffusion models in the literature are ICM and LTM.
LTM assumes that diffusion proceeds in discrete time steps. At any time, a node is either active (i.e., influenced) or inactive. The model is progressive: once a node becomes active, it cannot become inactive again. Each node contributes, in a way, to the activation of its neighbours. In LTM, each link is assigned a weight $w_{vu}$ representing the influence of node $v$ on the target node $u$, and each node $u$ is assigned a threshold $\theta_u$ for activation. The process starts with a set of initially active nodes, which serve as the seed nodes. At any time step $t$, an inactive node $u$ becomes active if the sum of the influence weights on the links originating from its active neighbours exceeds the randomly determined threshold $\theta_u$. The process runs until a time step at which no more nodes are activated.
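The threshold process described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this study: the function name `linear_threshold_spread`, the dictionary-based inputs, and the toy network are hypothetical. Thresholds are passed in as fixed values, and the sketch assumes that a node activates when the weighted sum meets or exceeds its threshold, which is a common convention; a strict reading of "exceeds" would use `>` instead.

```python
from typing import Dict, List, Set, Tuple


def linear_threshold_spread(
    weights: Dict[Tuple[str, str], float],
    thresholds: Dict[str, float],
    seeds: Set[str],
) -> Set[str]:
    """Return the final active set of a progressive linear threshold process.

    weights[(v, u)] is the influence of node v on node u (hypothetical layout).
    Every node that can be activated is assumed to appear in `thresholds`.
    """
    # Index incoming links per target node: u -> [(v, w_vu), ...].
    incoming: Dict[str, List[Tuple[str, float]]] = {u: [] for u in thresholds}
    for (v, u), w in weights.items():
        incoming.setdefault(u, []).append((v, w))

    active = set(seeds)
    while True:
        # Synchronous update: decisions at this step use the previous active set.
        # Assumption: activation when the sum meets or exceeds the threshold.
        newly_active = {
            u
            for u in thresholds
            if u not in active
            and sum(w for v, w in incoming[u] if v in active) >= thresholds[u]
        }
        if not newly_active:
            return active  # no node changed state, so the process has stopped
        active |= newly_active


# Toy network: a influences b and c, b influences c, c influences d.
example_weights = {("a", "b"): 0.6, ("a", "c"): 0.3, ("b", "c"): 0.4, ("c", "d"): 0.2}
example_thresholds = {"a": 0.5, "b": 0.5, "c": 0.6, "d": 0.5}
print(sorted(linear_threshold_spread(example_weights, example_thresholds, {"a"})))
```

In the toy network, seeding `a` activates `b` in the first step; `c` follows in the second step because the combined weight from `a` and `b` reaches its threshold, while `d` stays inactive. With fixed thresholds, the same seed set always produces the same final active set.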
In ICM [7], on the other hand, a node $v$ activated at time $t$ tries to activate its inactive neighbour node $u$ only at time $t + 1$. The attempt is successful with probability $p_{vu}$. Therefore, ICM is inherently a stochastic process.
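For contrast with the threshold process, the cascade rule can be sketched as follows. This is an illustrative sketch only: the function name `independent_cascade_spread`, the dictionary layout, and the optional `rng` argument are hypothetical choices. It assumes that each newly activated node gets exactly one attempt per inactive out-neighbour, made in the step right after its own activation.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def independent_cascade_spread(
    probabilities: Dict[Tuple[str, str], float],
    seeds: Set[str],
    rng: Optional[random.Random] = None,
) -> Set[str]:
    """Simulate one run of an independent cascade and return the active set.

    probabilities[(v, u)] is the chance that v activates u (hypothetical layout).
    """
    rng = rng or random.Random()

    # Index outgoing links per source node: v -> [(u, p_vu), ...].
    outgoing: Dict[str, List[Tuple[str, float]]] = {}
    for (v, u), p in probabilities.items():
        outgoing.setdefault(v, []).append((u, p))

    active = set(seeds)
    frontier = set(seeds)  # nodes activated in the most recent step
    while frontier:
        next_frontier: Set[str] = set()
        # Sorting keeps runs reproducible for a given random seed.
        for v in sorted(frontier):
            for u, p in outgoing.get(v, []):
                # Each (v, u) pair is tried once, only while u is still inactive.
                if u not in active and u not in next_frontier and rng.random() < p:
                    next_frontier.add(u)
        active |= next_frontier
        frontier = next_frontier
    return active


example_probabilities = {("a", "b"): 0.5, ("b", "c"): 0.5, ("a", "c"): 0.1}
print(sorted(independent_cascade_spread(example_probabilities, {"a"}, random.Random(7))))
```

Because each attempt is a random draw, two runs with the same seed set can end with different active sets, so spread under this model is usually estimated by averaging over many simulated runs.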
Our contribution
In this study, we make the following contributions:
- We define the new Targeted and Budgeted Influence Maximization in Social Networks under Deterministic Linear Threshold Model problem. This problem differs from the existing studies in the literature by (i) considering a deterministic diffusion model, (ii) extending the original Influence Maximization Problem [7] to a targeted version of the problem where nodes might carry heterogeneous profit values, and to a budgeted version of the problem where nodes might carry heterogeneous cost values for becoming seed nodes.
- We develop a new algorithm named Targeted and Budgeted Potential Greedy (TABU-PG) for the problem we defined. The algorithm employs a set of alternative methods for node selection and potential gain calculation. Some of the optional methods included in TABU-PG are taken from the literature to serve as benchmarks and the others are novel methods introduced in this work.
- We propose novel methods to enable TABU-PG heuristics to run on very large networks in a significantly shorter amount of time by trading between spread performance (i.e., total profit) and runtime.
- We propose new methods for generating influence weights for links; and threshold, profit, and cost values for nodes. In our opinion, in many cases, our methods reflect the real world dynamics more accurately than most widely employed methods in the literature.
- We provide empirical evaluations of TABU-PG heuristics and benchmarks such as closeness, betweenness, pagerank, strength, authority, hub, eigenvector, and random heuristics. With extensive computational experiments we show how all heuristics perform with 8 different datasets on 4 different real-life networks.
The paper is structured as follows. In Section 2, we review how the Influence Maximization Problem emerged and developed in the literature, and compare it with our study. In Section 3, we provide a formal definition of the problem, present our TABU-PG algorithm, and describe the dataset generation methods we employ. In Section 4, experimental results and discussion are given. The conclusion and final remarks are given in Section 5.
Related work
Domingos and Richardson [5] popularized the concept of network value of customers. By approaching the market as a set of connected entities rather than independent entities, they shifted the paradigm to considering the extra value which might emerge as a result of influences between entities instead of considering only the intrinsic value of each entity. Their study introduced the fundamental problem of Influence Maximization, that is how to choose seed nodes so that particular influence spread
Formal problem definition
Let $G = (V, E)$ be a directed network, where $V$ is the set of nodes and $E$ is the set of links. Each node $v \in V$ is associated with a threshold value $\theta_v$, an activation cost $c_v$ for being a seed node, and a profit value $p_v$. Each directed link $(u, v) \in E$ has an influence weight $i_{uv}$ representing the amount of influence node $u$ has on node $v$. The budget is denoted by $B$.
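To make the roles of the node costs, node profits, and budget concrete, the following sketch evaluates a candidate seed set. It is a hedged illustration and not the formal objective of this study: the helper names `seed_cost`, `is_within_budget`, and `total_profit` are hypothetical, and it assumes that the profit of a solution is the sum of the profit values of all nodes that end up active, seeds included.

```python
from typing import Dict, Set


def seed_cost(costs: Dict[str, float], seeds: Set[str]) -> float:
    """Total activation cost of a candidate seed set."""
    return sum(costs[v] for v in seeds)


def is_within_budget(costs: Dict[str, float], seeds: Set[str], budget: float) -> bool:
    """Check the budget constraint: the seed set must cost at most the budget B."""
    return seed_cost(costs, seeds) <= budget


def total_profit(profits: Dict[str, float], active: Set[str]) -> float:
    """Assumed objective: sum of profit values over the finally active nodes."""
    return sum(profits[v] for v in active)


# Hypothetical toy instance with three nodes and a budget of 5.
example_costs = {"a": 3, "b": 2, "c": 4}
example_profits = {"a": 5, "b": 1, "c": 7}
example_seeds = {"a", "b"}
print(is_within_budget(example_costs, example_seeds, budget=5))
print(total_profit(example_profits, {"a", "b", "c"}))
```

The set passed to `total_profit` would come from running the diffusion process on the chosen seeds; here it is supplied by hand to keep the sketch self-contained.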
At any time step, a node can be in only one of two states, inactive or active, represented by $\sigma_v \in \{0, 1\}$. f(v)
Experimental results
We present the performance of our algorithm with experimental results. An experiment is performed for each generated dataset. Experiments 1 and 2 are for Epinions, Experiments 3 and 4 are for Academia, Experiments 5 and 6 are for Inploid, and Experiments 7 and 8 are for Pokec networks.
For each experiment, strength,5
Conclusion
In this paper, we defined the new Targeted and Budgeted Influence Maximization Problem under Deterministic LTM. We extended the original Influence Maximization Problem by allowing different nodes to carry different cost and return values under a Deterministic LTM. This makes it possible to model different real-world Influence Maximization problems depending on how the return values are generated; assigning values based on estimated profits would make it a profit maximization problem whereas
Acknowledgement
F. Gursoy is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under 2210-A Program.
References (43)
- et al. On positive influence dominating sets in social networks. Theor. Comput. Sci. (2011)
- The Tipping Point: How Little Things Can Make a Big Difference (2006)
- Thinking, Fast and Slow (2011)
- Diffusion of Innovations (1962)
- Word of mouth and viral marketing: taking the temperature of the hottest trends in marketing. J. Consumer Marketing (2008)
- et al. Mining the network value of customers. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2001)
- et al. A Note on Maximizing the Spread of Influence in Social Networks
- et al. Maximizing the spread of influence through a social network. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2003)
- et al. Simpath: an efficient algorithm for influence maximization under the linear threshold model. Data Mining (ICDM), 2011 IEEE 11th International Conference on (2011)
- et al. Scalable influence maximization in social networks under the linear threshold model. Data Mining (ICDM), 2010 IEEE 10th International Conference on (2010)
- Cost-effective outbreak detection in networks. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- Celf++: optimizing the greedy algorithm for influence maximization in social networks. Proceedings of the 20th International Conference Companion on World Wide Web
- Ublf: An upper bound based approach to discover influential nodes in social networks. Data Mining (ICDM), 2013 IEEE 13th International Conference on
- Efficient influence maximization in social networks. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- Scalable influence maximization for prevalent viral marketing in large-scale social networks. Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- Labeled influence maximization in social networks for target marketing. Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on
- Profit maximization over social networks. Data Mining (ICDM), 2012 IEEE 12th International Conference on
- Real-time targeted influence maximization for online advertisements. Proceedings of the VLDB Endowment
- A query approach for influence maximization on specific users in social networks. IEEE Trans. Knowl. Data Eng.
- Targeted influence maximization in social networks. Proceedings of the 25th ACM International Conference on Information and Knowledge Management
- On budgeted influence maximization in social networks. IEEE J. Sel. Areas Commun.