Elsevier

Neurocomputing

Volume 338, 21 April 2019, Pages 92-100
Post and repost: A holistic view of budgeted influence maximization

https://doi.org/10.1016/j.neucom.2019.02.010

Abstract

Existing studies on influence maximization (IM) mainly focus on activating a set of influential users (seed nodes). The seed nodes’ promotional actions on social networks (e.g., posting an advertising tweet) may then trigger a large influence spread. In practice, however, it is usually very expensive to have influential users post original tweets in a promotional event. In contrast, it costs much less to have influential users repost tweets and ordinary users post original tweets. Inspired by these observations, we consider the Holistic Budgeted Influence Maximization (HBIM) problem, which maximizes the influence spread by allocating the budget between seed nodes (for posting) and boost nodes (for reposting). To tackle the NP-hardness and non-submodularity of the problem, we devise two efficient algorithms with data-dependent approximation ratios. Extensive experiments on real social networks demonstrate the efficiency and effectiveness of the proposed algorithms.

Introduction

Social media marketing is drawing increasing attention from industry and the research community [1], [2]. By selecting a group of influential users (seed nodes) to post specific tweets such as online comments, product reviews, etc., a large chain of product adoption might be triggered [3], [4]. To support effective marketing strategies in social media, Influence Maximization has become a hot research topic [5], [6]. Existing works mainly focus on selecting the optimal seed nodes to maximize the influence spread, with an underlying assumption that the costs of involving different users are equal. In fact, this assumption seldom holds: it is usually more expensive to involve influential users in a promotional event than ordinary users. This difference in costs motivated research on the Budgeted Influence Maximization problem [7]. However, [7] assigns a random cost to each node, disregarding the fact that selecting influential users as seed nodes usually incurs a high cost in social media marketing.

More recently, Lin et al. [8] proposed the influence boost model, in which a set of nodes is “boosted” so that they are more susceptible to their friends’ influence. By selecting appropriate boost nodes, the influence spread of a given set of initial seed nodes can be increased. Such a pattern does exist in practice, e.g., reposting by influential users may boost the spread of a specific tweet.

However, the work of [8] only considers selecting boost nodes to increase the influence spread for a given set of seed nodes, under the equal cost assumption. In fact, the influence boost model can provide a more flexible mechanism for budget allocation with different costs, given that persuading a user to repost a tweet usually costs much less than persuading the same user to post an original one. Consequently, a better budget allocation can be achieved for influence maximization by involving both seed nodes and boost nodes in selection.

In this paper, we propose a new framework for influence maximization, named Holistic Budgeted Influence Maximization (HBIM), which explicitly involves both seed and boost nodes in selection. Given the costs of seed and boost nodes, HBIM maximizes the expected influence spread in a social network by finding the optimal deployment of seed nodes (to post) and boost nodes (to repost) under the budget constraint. By involving both seed nodes and boost nodes in the influence spread, HBIM offers more flexibility in budget-based influence maximization. As this is the case for most commercial promotions in social media, we expect our work to have good applicability in real-world scenarios.
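The description above can be written as a constrained optimization problem. The following is only a sketch of one plausible formalization: the symbols $\sigma(S,B)$ (expected influence spread), $c_s(\cdot)$ and $c_b(\cdot)$ (per-node seed and boost costs), and $\beta$ (total budget) are notation assumed here for illustration and may differ from the notation in the paper's Section 3 and Table 1. Whether the seed set and boost set must be disjoint is not stated in this excerpt, so no such constraint is imposed.

```latex
% Assumed notation (not taken from the source):
%   S \subseteq V : seed nodes (post),  B \subseteq V : boost nodes (repost)
%   c_s(u), c_b(v): cost of using u as a seed / v as a boost node
%   \beta         : total budget
\begin{aligned}
\max_{S,\,B \subseteq V} \quad & \sigma(S, B) \\
\text{s.t.} \quad & \sum_{u \in S} c_s(u) + \sum_{v \in B} c_b(v) \le \beta
\end{aligned}
```

Under this reading, the premise that reposting is cheaper than posting corresponds to $c_b(v) < c_s(v)$ for a given user $v$.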

Nevertheless, HBIM is NP-hard, and computing the expected influence spread for a given budget deployment is #P-hard. Moreover, the influence spread in the HBIM problem is not submodular, meaning that the greedy algorithm offers no performance guarantee. To address these problems, we develop two efficient algorithms for HBIM, IMD and IMD-LB, with data-dependent approximation ratios. Extensive experiments are conducted on real social networks. The experimental results show the efficiency and effectiveness of the proposed algorithms and demonstrate their superiority over the compared algorithms. We summarize our major contributions as follows.

1. We propose a new framework, Holistic Budgeted Influence Maximization (HBIM), which explicitly involves the selection of both seed and boost nodes. This framework may offer more flexibility in real-world scenarios.

2. We prove that HBIM is NP-hard and that computing the expected influence spread for a given budget deployment is #P-hard. Moreover, the influence spread in the HBIM problem is not submodular, meaning that the greedy algorithm offers no performance guarantee.

3. We develop two efficient algorithms for HBIM, IMD and IMD-LB, with provable data-dependent approximation ratios.

4. We conduct extensive experiments. The results show the efficiency and effectiveness of the proposed algorithms and demonstrate their superiority over the compared algorithms.

The rest of this paper is organized as follows. In Section 2, we discuss related work. We then formally define the HBIM problem and discuss its properties in Section 3. In Section 4, we develop two efficient algorithms for solving HBIM with data-dependent approximation ratios. Extensive experiments on real social networks are presented in Section 5. Conclusions are presented in Section 6. For convenience, we list the most frequently used symbols in Table 1.

Section snippets

Related works

Domingos and Richardson [9], [10] were the first to study the influence maximization problem in social networks, and they formulated it within a probabilistic framework. Kempe et al. [5] further formulated it as a discrete optimization problem, a formulation widely adopted by subsequent studies. They proved that the problem is NP-hard and proposed a greedy algorithm that solves it approximately by repeatedly selecting the node that brings the largest marginal increase in influence. Following their work,
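The greedy selection rule described above (repeatedly add the node with the largest marginal gain) can be sketched as follows. This is a minimal illustration, not the implementation from [5]: the function name `greedy_select`, the toy `REACH` table, and the deterministic coverage oracle `toy_spread` are all hypothetical stand-ins for an expected-spread estimator.

```python
from typing import Callable, FrozenSet, Iterable, List


def greedy_select(
    candidates: Iterable[str],
    k: int,
    spread: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Pick up to k nodes, each time adding the one with the largest marginal gain."""
    chosen: List[str] = []
    remaining = set(candidates)
    current = spread(frozenset())
    for _ in range(k):
        best, best_gain = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(remaining):
            gain = spread(frozenset(chosen) | {node}) - current
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # no candidates left
            break
        chosen.append(best)
        remaining.remove(best)
        current += best_gain
    return chosen


# Hypothetical toy oracle: each node "reaches" a fixed set of nodes, and the
# spread of a seed set is the size of the union (a coverage function).
REACH = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "c": {"c", "d"}, "d": {"d"}}


def toy_spread(seeds: FrozenSet[str]) -> float:
    covered = set()
    for s in seeds:
        covered |= REACH[s]
    return float(len(covered))


picked = greedy_select(REACH, 2, toy_spread)
print(picked, toy_spread(frozenset(picked)))
```

In an actual influence maximization setting, the oracle would be an estimate of the expected spread under a diffusion model rather than a fixed coverage table.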

Problem definition

To present our problem definition, we start by introducing the independent cascade (IC) model [5] and its extension, the influence boosting model.

In the IC model, given a graph $G=(V,E)$, each edge $e_{uv} \in E$ is associated with a probability $p_{uv}$ and each node $u \in V$ is initially inactive. During the diffusion process, a newly activated node $u$ has a single chance to activate each of its inactive neighbors $v$, succeeding with probability $p_{uv}$. The Influence Maximization problem is to find a set $S \subset V$ of $k$ seed nodes
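The diffusion process of the IC model, together with a simple form of boosting, can be sketched as a simulation. This is an illustrative sketch under stated assumptions, not the paper's method: it assumes that every edge carries a base probability and a second, higher probability that applies when the target node is boosted (following the idea in [8] that boosted nodes are more susceptible to their friends' influence). The names `simulate_ic`, `estimate_spread`, and `demo_graph`, and the probability values, are hypothetical.

```python
import random
from typing import Dict, Iterable, Set, Tuple

# graph[u][v] = (p_uv, boosted probability used when v is a boost node).
# The boosted probability is an assumption made for this sketch.
Graph = Dict[str, Dict[str, Tuple[float, float]]]


def simulate_ic(
    graph: Graph, seeds: Iterable[str], boosted: Set[str], rng: random.Random
) -> Set[str]:
    """Run one independent cascade and return the set of activated nodes."""
    frontier = list(seeds)
    active = set(frontier)
    while frontier:
        next_frontier = []
        for u in frontier:
            # Each newly activated node gets one trial per inactive neighbor.
            for v, (p, p_boost) in graph.get(u, {}).items():
                if v in active:
                    continue
                prob = p_boost if v in boosted else p
                if rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(
    graph: Graph, seeds: Iterable[str], boosted: Set[str],
    runs: int = 2000, seed: int = 0,
) -> float:
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    seeds = list(seeds)
    total = sum(len(simulate_ic(graph, seeds, boosted, rng)) for _ in range(runs))
    return total / runs


demo_graph: Graph = {"a": {"b": (0.2, 0.6)}, "b": {"c": (0.5, 0.5)}}
print(estimate_spread(demo_graph, ["a"], set()))
print(estimate_spread(demo_graph, ["a"], {"b"}))
```

Monte Carlo simulation is used here only because it is the simplest way to estimate the expected spread; it is not the estimation technique of the algorithms proposed in this paper.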

Proposed algorithms

Given Theorem 1 and the non-submodularity of the problem, the classical greedy algorithm cannot achieve a $1-1/e$ approximation. To tackle these problems, in this section, we propose two algorithms for solving the HBIM problem with data-dependent approximation ratios by utilizing Potentially Reverse Reachable graphs (PRR-graphs).
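For context, the property that HBIM lacks can be stated as follows. This is the standard textbook definition of submodularity, written with a generic set function $f$ rather than the paper's own notation.

```latex
% f : 2^V \to \mathbb{R} is submodular if marginal gains diminish:
f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B)
\qquad \text{for all } A \subseteq B \subseteq V,\; x \in V \setminus B .

% For a monotone submodular f with f(\emptyset) = 0 and a cardinality
% constraint |S| \le k, the greedy solution S_g satisfies
f(S_g) \;\ge\; \left(1 - \frac{1}{e}\right) \max_{|S| \le k} f(S) .
```

When the diminishing-returns inequality can fail, as it does for the influence spread in HBIM, this classical argument no longer applies, which is why the guarantee is lost.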

Datasets

We use three real social networks, as listed in Table 2. Epinions is a who-trust-whom online social network of a general consumer review site, Epinions.com. Members of the site can decide whether to “trust” each other. Gowalla is a location-based social networking website where users share their locations by checking in. The friendship network is undirected and was collected using their public API. Youtube is a video-sharing web site that includes a social

Conclusion

In this work, we present a novel holistic budgeted influence maximization (HBIM) problem that maximizes the influence spread by finding the optimal deployment of seed and boost nodes. We develop two efficient approximation algorithms, IMD and IMD-LB, with data-dependent approximation ratios. Both algorithms carefully integrate Potentially Reverse Reachable graphs, a state-of-the-art IM method, and a greedy selection algorithm. Extensive experiments are conducted on real social networks and the

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant no: U1866602) and by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada. It is also partially supported by ByteDance.

Qihao Shi received his B.S. at Nanjing Normal University of China in 2014. He is currently a Ph.D. candidate in the college of computer science at Zhejiang University. His main research topics are social and information networks, algorithmic game theory and Internet economics.

References (36)

  • D. Kempe et al., Maximizing the spread of influence through a social network, Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2003)
  • W. Chen et al., Scalable influence maximization for prevalent viral marketing in large-scale social networks, Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2010)
  • N. Hallier, On budgeted influence maximization in social networks, IEEE J. Sel. Areas Commun. (2013)
  • Y. Lin et al., Boosting information spread: an algorithmic approach, Proceedings of the IEEE International Conference on Data Engineering (2017)
  • P. Domingos et al., Mining the network value of customers, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2001)
  • M. Richardson et al., Mining knowledge-sharing sites for viral marketing, Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2002)
  • J. Leskovec et al., Cost-effective outbreak detection in networks, Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2007)
  • W. Chen et al., Efficient influence maximization in social networks, Proceedings of the Fifteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2009)
    Can Wang received the Ph.D. degree and M.S. degree in computer science and B.S. degree in economics from Zhejiang University, in 2009, 2003 and 1995 respectively. His research interests include data mining, machine learning and information retrieval.

    Jiawei Chen received his B.S. at University of Electronic Science and Technology of China in 2014. He is currently a Ph.D. candidate in the college of computer science at Zhejiang University. His main research topics are recommendation, graphical model and deep learning.

    Yan Feng received the Ph.D. degree in computer application from Zhejiang University in 2004. She is currently an associate professor in the College of Computer Science at Zhejiang University, China. Her research interests include databases, data mining, etc.

    Chun Chen is a professor in the College of Computer Science, Zhejiang University. His research interests include data mining, computer vision, computer graphics and embedded technology.
