Model-free inference of diffusion networks using RKHS embeddings

Hu, Shoubo; Cautis, Bogdan; Chen, Zhitang; Chan, Laiwan; Geng, Yanhui; He, Xiuqiang

doi:10.1007/s10618-018-00611-1

Published: 17 January 2019

Volume 33, pages 499–525, (2019)

Data Mining and Knowledge Discovery

Shoubo Hu¹,
Bogdan Cautis² (ORCID: orcid.org/0000-0003-3497-042X),
Zhitang Chen³,
Laiwan Chan¹,
Yanhui Geng³ &
Xiuqiang He³

Abstract

In this paper, we revisit the problem of inferring a diffusion network from information cascades. We make no assumptions about the underlying diffusion model, which yields a generic method with broader practical applicability. Our approach exploits the pairwise adoption-time intervals observed in cascades. Starting from the observation that different kinds of information spread differently, we interpret these time intervals as samples drawn from unknown (conditional) distributions. To distinguish these distributions statistically, we propose a novel method based on Reproducing Kernel Hilbert Space (RKHS) embeddings. Experiments on synthetic data and on real-world data from Twitter and Flixster show that our method significantly outperforms state-of-the-art methods. We argue that our algorithm can be implemented with parallel batch processing, and can thus meet the efficiency and scalability needs of real-world applications.
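
To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in this preview. It assumes each cascade is a mapping from user to adoption time, collects the pairwise adoption-time intervals, and compares two samples of intervals through the distance between their empirical RKHS mean embeddings (the squared maximum mean discrepancy) under a Gaussian kernel. The function names, the Gaussian kernel, and the `bandwidth` parameter are illustrative assumptions.

```python
import numpy as np


def pairwise_intervals(cascades):
    """Collect intervals t_v - t_u for each ordered pair (u, v) such that
    u adopted strictly before v in the same cascade.

    Assumption: every cascade is a dict mapping user -> adoption time.
    """
    intervals = {}
    for cascade in cascades:
        users = sorted(cascade, key=cascade.get)  # earliest adopter first
        for i, u in enumerate(users):
            for v in users[i + 1:]:
                dt = cascade[v] - cascade[u]
                if dt > 0:  # skip simultaneous adoptions
                    intervals.setdefault((u, v), []).append(dt)
    return intervals


def gaussian_gram(x, y, bandwidth):
    """Gram matrix k(x_i, y_j) of a Gaussian kernel on scalar samples."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(1, -1)
    return np.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared RKHS distance between the empirical
    mean embeddings of samples x and y (squared MMD)."""
    return (gaussian_gram(x, x, bandwidth).mean()
            + gaussian_gram(y, y, bandwidth).mean()
            - 2.0 * gaussian_gram(x, y, bandwidth).mean())
```

A value of `mmd2` near zero suggests that the two interval samples are hard to tell apart under the chosen kernel, while a larger value suggests different underlying distributions. How such distances are turned into edge decisions is specific to the paper and is not reproduced here.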

Notes

E.g., we set 20 as a minimal support threshold in our experiments, since inference on candidate edges with very few adoption time intervals would have little statistical significance.
Superscript (u, v) on $y_{c}$ is omitted as it remains the same for all time intervals in $D^{(i)}$.
To support this intuition, we also show in Fig. 1 the distribution of adoption time intervals conditioned by movie popularity in our Flixster experimental dataset, for connected (blue) or unconnected (red) user pairs. For similar empirical evidence we also refer the reader to Du et al. (2013).
http://snap.stanford.edu/.
We are grateful to the authors for the binary package of NPDC and one synthetic dataset.
https://github.com/amber0309/KEBC.
The proportion of pairs with edge and without edge in these batches should be very similar to the one of the entire dataset.
http://imdbpy.sourceforge.net.
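
The minimal support threshold mentioned in the first note can be illustrated with a short sketch. It assumes candidate edges are stored as a dict mapping a user pair to its list of adoption-time intervals. That layout and the name `filter_by_support` are hypothetical; only the default value of 20 comes from the note.

```python
def filter_by_support(intervals, min_support=20):
    """Keep only candidate edges with at least `min_support` observed
    adoption-time intervals.

    Assumption: `intervals` maps (u, v) -> list of time intervals.
    """
    return {pair: dts for pair, dts in intervals.items()
            if len(dts) >= min_support}
```

Candidate pairs with fewer observations are dropped before any statistical comparison, in line with the note's remark that inference on them would carry little statistical significance.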

References

Arthur D, Vassilvitskii S (2007) K-means++: The advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms, SODA ’07, pp 1027–1035, Philadelphia. Society for Industrial and Applied Mathematics
Chen W, Wang Y, Yuan Y, Wang Q (2016) Combinatorial multi-armed bandit and its extension to probabilistically triggered arms. J Mach Learn Res 17(50):1–33
Chen Z, Zhang K, Chan L, Schölkopf B (2014) Causal discovery via reproducing kernel Hilbert space embeddings. Neural Comput 26(7):1484–1517 PMID: 24708374
Dhillon IS, Guan Y, Kulis B (2004) Kernel k-means: spectral clustering and normalized cuts. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’04, pp 551–556. ACM, New York
Du N, Song L, Woo H, Zha H (2013) Uncover topic-sensitive information diffusion networks. In: Carvalho CM, Ravikumar P (eds), Proceedings of the sixteenth international conference on artificial intelligence and statistics, vol 31 of proceedings of machine learning research, pp 229–237. PMLR, Scottsdale, Arizona
Du N, Song L, Yuan M, Smola AJ (2012) Learning networks of heterogeneous influence. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25. Curran Associates Inc, pp 2780–2788
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, New York
Fraley RC, Raftery AE (2002) Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97:611–631
Gomez-Rodriguez M, Balduzzi D, Schölkopf B (2011) Uncovering the temporal dynamics of diffusion networks. In: Proceedings of the 28th international conference on international conference on machine learning, ICML’11, pp 561–568. Omnipress, USA
Gomez-Rodriguez M, Leskovec J, Krause A (2012) Inferring networks of diffusion and influence. ACM Trans Knowl Discov Data 5(4):21:1–21:37
Gomez-Rodriguez M, Leskovec J, Schölkopf B (2013) Structure and dynamics of information pathways in online media. In: Proceedings of the sixth ACM international conference on web search and data mining, WSDM ’13, pp 23–32. ACM, New York
Gomez-Rodriguez M, Schölkopf B (2012) Influence maximization in continuous time diffusion networks. In: Proceedings of the 29th international conference on international conference on machine learning, ICML’12, pp 579–586. Omnipress, USA
Gomez-Rodriguez M, Song L, Du N, Zha H, Schölkopf B (2016) Influence estimation and maximization in continuous-time diffusion networks. ACM Trans Inf Syst 34(2):9:1–9:33
Goyal A, Bonchi F, Lakshmanan LV (2010) Learning influence probabilities in social networks. In: Proceedings of the third ACM international conference on web search and data mining, WSDM ’10, pp 241–250, ACM, New York
Grabowicz PA, Ganguly N, Gummadi KP (2016) Distinguishing between topical and non-topical information diffusion mechanisms in social media. In: Proceedings of the 10th international conference on web and social media, pp 151–160
Gretton A, Borgwardt KM, Rasch MJ, Schölkopf B, Smola A (2012) A kernel two-sample test. J Mach Learn Res 13(1):723–773
Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on recommender systems, RecSys ’10, pp 135–142. ACM, New York
Jegelka S, Gretton A, Schölkopf B, Sriperumbudur BK, von Luxburg U (2009) Generalized clustering via kernel embeddings. In: Mertsching B, Hund M, Aziz Z (eds) KI 2009: advances in artificial intelligence. Springer, Berlin, pp 144–152
Kempe D, Kleinberg J, Tardos E (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’03, pp 137–146. ACM, New York
Lei S, Maniu S, Mo L, Cheng R, Senellart P (2015) Online influence maximization. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’15, pp 645–654. ACM, New York
McLachlan G, Peel D (2004) Finite mixture models. Wiley series in probability and statistics: applied probability and statistics. Wiley, London
Muandet K, Fukumizu K, Sriperumbudur B, Schölkopf B (2017) Kernel mean embedding of distributions: a review and beyond. Foundations Trends Mach Learn 10(1–2):1–141
Myers S, Leskovec J (2010) On the convexity of latent social network inference. In: Advances in neural information processing systems 23, pp 1741–1749. Curran Associates, Inc
Rahimi A, Recht B (2008) Random features for large-scale kernel machines. In: Advances in neural information processing systems 20, pp 1177–1184. Curran Associates, Inc
Romero DM, Meeder B, Kleinberg J (2011) Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on Twitter. In: Proceedings of the 20th international conference on world wide web, WWW ’11, pp 695–704. ACM, New York
Rong Y, Zhu Q, Cheng H (2016) A model-free approach to infer the diffusion network from event cascade. In: Proceedings of the 25th ACM international on conference on information and knowledge management, CIKM ’16, pp 1653–1662. ACM, New York
Rudin W (2017) Fourier analysis on groups. Dover books on mathematics. Dover Publications, NY
Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. In: Proceedings of the 12th international conference on knowledge-based intelligent information and engineering systems, Part III, KES ’08, pp 67–75. Springer, Berlin
Schölkopf B, Smola AJ (2001) Learning with Kernels: support vector machines, regularization, optimization, and beyond. MIT Press, Cambridge
Shirazi S, Harandi MT, Sanderson C, Alavi A, Lovell BC (2012) Clustering on Grassmann manifolds via kernel embedding with application to action analysis. In: 2012 19th IEEE international conference on image processing, pp 781–784
Smola A, Gretton A, Song L, Schölkopf B (2007) A Hilbert space embedding for distributions. In: Hutter M, Servedio RA, Takimoto E (eds) Algorithmic learning theory. Springer, Berlin, pp 13–31
Song L, Fukumizu K, Gretton A (2013) Kernel embeddings of conditional distributions: a unified kernel framework for nonparametric inference in graphical models. IEEE Signal Process Mag 30(4):98–111
Song L, Huang J, Smola A, Fukumizu K (2009) Hilbert space embeddings of conditional distributions with applications to dynamical systems. In: Proceedings of the 26th annual international conference on machine learning, ICML ’09, pp 961–968. ACM, New York
Vaswani S, Lakshmanan V, Schmidt M (2015) Influence maximization with bandits. In: NIPS workshop on networks in the social and information sciences
Villmann T, Biehl M, Hammer B, Verleysen M (2009) Similarity-based clustering: recent developments and biomedical applications, vol 5400. Springer, Berlin
Wang H, Shi X, Yeung D-Y (2017) Relational deep learning: a deep latent variable model for link prediction. In: AAAI conference on artificial intelligence, pp 2688–2694
Watts D (2004) Six degrees: the science of a connected age. W. W. Norton
Watts DJ, Dodds PS (2007) Influentials, networks, and public opinion formation. J Consum Res 34(4):441–458
Wen Z, Kveton B, Valko M, Vaswani S (2017) Online influence maximization under independent cascade model with semi-bandit feedback. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems 30. Curran Associates Inc, pp 3022–3032

Author information

Authors and Affiliations

The Chinese University of Hong Kong, Shatin, Hong Kong
Shoubo Hu & Laiwan Chan
University of Paris-Sud, Orsay, France
Bogdan Cautis
Huawei Noah’s Ark Lab, Shatin, Hong Kong
Zhitang Chen, Yanhui Geng & Xiuqiang He

Corresponding author

Correspondence to Bogdan Cautis.

Additional information

Responsible editor: Jesse Davis, Elisa Fromont, Derek Greene and Bjørn Bringmann.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hu, S., Cautis, B., Chen, Z. et al. Model-free inference of diffusion networks using RKHS embeddings. Data Min Knowl Disc 33, 499–525 (2019). https://doi.org/10.1007/s10618-018-00611-1

Received: 27 January 2018
Accepted: 14 December 2018
Published: 17 January 2019
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s10618-018-00611-1

Part of a collection:

Journal Track of ECML PKDD 2019
