Efficiently summarizing attributed diffusion networks

Amiri, Sorour E.; Chen, Liangzhe; Prakash, B. Aditya

doi:10.1007/s10618-018-0572-z

Published: 18 May 2018

Volume 32, pages 1251–1274 (2018)

Data Mining and Knowledge Discovery

Abstract

Given a large attributed social network, can we find a compact, diffusion-equivalent representation while keeping the attribute properties? Diffusion networks with user attributes, such as friendship, email communication, and people-contact networks, are increasingly commonplace in the real world. However, analyzing them is challenging due to their large size. In this paper, we first formulate a novel problem: summarizing an attributed diffusion graph so as to preserve its attribute and influence-based properties. Next, we propose ANeTS, an effective, sub-quadratic, parallelizable algorithm to solve this problem: it finds the best set of candidate nodes and merges them to construct a smaller network of ‘super-nodes’ that preserves the desired properties. Extensive experiments on diverse real-world datasets show that ANeTS outperforms all state-of-the-art baselines (some of which do not even finish in 14 days). Finally, we show how ANeTS helps in multiple applications, such as topic-aware viral marketing and sense-making of diverse graphs from different domains.
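The abstract does not say how ANeTS selects candidate nodes or how it reweights the merged graph, so the following is only a minimal Python sketch of the generic idea of collapsing a group of nodes into one super-node. The function name `merge_nodes`, the averaging of parallel edge weights, and the use of the mean attribute vector are illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def merge_nodes(edges, attrs, group, super_id):
    """Collapse the nodes in `group` into one super-node `super_id`.

    edges: dict mapping (u, v) -> weight (e.g. an influence probability)
    attrs: dict mapping node -> list of numeric attribute values
    Assumption (not from the paper): parallel edges created by the merge
    are averaged, and the super-node's attributes are the group mean.
    """
    group = set(group)

    def relabel(n):
        return super_id if n in group else n

    buckets = defaultdict(list)
    for (u, v), w in edges.items():
        u2, v2 = relabel(u), relabel(v)
        if u2 == v2:
            # Skip self-loops, which includes edges internal to the group.
            continue
        buckets[(u2, v2)].append(w)
    new_edges = {e: sum(ws) / len(ws) for e, ws in buckets.items()}

    # Keep attributes of untouched nodes; average those of merged nodes.
    new_attrs = {n: a for n, a in attrs.items() if n not in group}
    members = [attrs[n] for n in group]
    new_attrs[super_id] = [sum(col) / len(col) for col in zip(*members)]
    return new_edges, new_attrs


# Toy graph: nodes "a" and "b" are merged into the super-node "ab".
edges = {("a", "b"): 0.5, ("b", "a"): 0.3, ("a", "c"): 0.2,
         ("b", "c"): 0.4, ("c", "d"): 0.9}
attrs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [0.0, 0.0]}
new_edges, new_attrs = merge_nodes(edges, attrs, ["a", "b"], "ab")
```

In this toy graph the two edges from "a" and "b" into "c" become a single edge from "ab" to "c", and the edges between "a" and "b" disappear. A summarizer that aims to preserve diffusion behavior would need a more careful reweighting rule than the plain average used here.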

Author information

Authors and Affiliations

Department of Computer Science, Virginia Tech, Blacksburg, USA
Sorour E. Amiri, Liangzhe Chen & B. Aditya Prakash

Corresponding author

Correspondence to Sorour E. Amiri.

Additional information

Responsible editor: Jesse Davis, Elisa Fromont, Derek Greene, and Björn Bringmann.

Electronic supplementary material

Supplementary material 1 (pdf 1307 KB)

About this article

Cite this article

Amiri, S.E., Chen, L. & Prakash, B.A. Efficiently summarizing attributed diffusion networks. Data Min Knowl Disc 32, 1251–1274 (2018). https://doi.org/10.1007/s10618-018-0572-z

Received: 10 December 2017
Accepted: 11 May 2018
Published: 18 May 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10618-018-0572-z
