Abstract
We study the problem of simplifying a given directed graph by keeping a small subset of its arcs. Our goal is to maintain the connectivity required to explain a set of observed traces of information propagation across the graph. Unlike previous work, we do not make any assumption about an underlying model of information propagation. Instead, we approach the task as a combinatorial problem. We prove that the resulting optimization problem is \(\mathbf{NP}\)-hard. We show that a standard greedy algorithm performs very well in practice, even though it does not have theoretical guarantees. Additionally, if the activity traces have a tree structure, we show that the objective function is supermodular, and experimentally verify that the approach for size-constrained submodular minimization recently proposed by Nagano et al. (28th International Conference on Machine Learning, 2011) produces very good results. Moreover, when applied to the task of reconstructing an unobserved graph, our methods perform comparably to a state-of-the-art algorithm devised specifically for this task.

Notes
Yahoo! Meme was a microblogging service that was discontinued on May 25, 2012.
References
Arenas A, Duch J, Fernández A, Gómez S (2007) Size reduction of complex networks preserving modularity. New J Phys 9(6):176
Edmonds J (2003) Submodular functions, matroids, and certain polyhedra. In: Combinatorial optimization—Eureka, You Shrink!, Springer, Berlin, pp 11–26
Elkin M, Peleg D (2005) Approximating \(k\)-spanner problems for \(k > 2\). Theor Comput Sci 337(1):249–277
Foti NJ, Hughes JM, Rockmore DN (2011) Nonparametric sparsification of complex multiscale networks. PLoS One 6(2):e16431
Fujishige S (2005) Submodular functions and optimization, vol 58. Elsevier Science, Amsterdam
Fung WS, Hariharan R, Harvey NJ, Panigrahi D (2011) A general framework for graph sparsification. In: Proceedings of the 43rd annual ACM symposium on theory of computing, ACM, pp 71–80
Gomez-Rodriguez M, Leskovec J, Krause A (2010) Inferring networks of diffusion and influence. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 1019–1028
Gomez-Rodriguez M, Balduzzi D, Schölkopf B (2011) Uncovering the temporal dynamics of diffusion networks. In: Proceedings of the 28th international conference on machine learning, pp 561–568
Iwata S, Orlin JB (2009) A simple combinatorial algorithm for submodular function minimization. In: Proceedings of the twentieth annual ACM-SIAM symposium on discrete algorithms, Society for Industrial and Applied Mathematics, pp 1230–1237
Jamali M, Ester M (2010) Modeling and comparing the influence of neighbors on the behavior of users in social and similarity networks. In: 2010 IEEE international conference on data mining workshops (ICDMW), IEEE, pp 336–343
Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 137–146
Krause A (2010) SFO: a toolbox for submodular function optimization. J Mach Learn Res 11:1141–1144
Leskovec J, Faloutsos C (2007) Scalable modeling of real graphs using Kronecker multiplication. In: Proceedings of the 24th international conference on machine learning, ACM, pp 497–504
Leskovec J, Backstrom L, Kleinberg J (2009) Meme-tracking and the dynamics of the news cycle. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 497–506
Mathioudakis M, Bonchi F, Castillo C, Gionis A, Ukkonen A (2011) Sparsification of influence networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 529–537
Misiołek E, Chen DZ (2006) Two flow network simplification algorithms. Inf Process Lett 97(5):197–202
Nagano K, Kawahara Y, Aihara K (2011) Size-constrained submodular minimization through minimum norm base. In: Proceedings of the 28th international conference on machine learning, pp 977–984
Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions-I. Math Progr 14(1):265–294
Peleg D, Schäffer AA (1989) Graph spanners. J Graph Theory 13(1):99–116
Quirin A, Cordón O, Santamaria J, Vargas-Quesada B, Moya-Anegón F (2008) A new variant of the pathfinder algorithm to generate large visual science maps in cubic time. Inf Process Manag 44(4):1611–1623
Serrano E, Quirin A, Botia J, Cordón O (2010) Debugging complex software systems by means of pathfinder networks. Inf Sci 180(5):561–583
Serrano MÁ, Boguñá M, Vespignani A (2009) Extracting the multiscale backbone of complex weighted networks. Proc Natl Acad Sci USA 106(16):6483–6488
Srikant R, Yang Y (2001) Mining web logs to improve website organization. In: Proceedings of the 10th international conference on World Wide Web, ACM, pp 430–437
Svitkina Z, Fleischer L (2011) Submodular approximation: sampling-based algorithms and lower bounds. SIAM J Comput 40(6):1715–1737
Toivonen H, Mahler S, Zhou F (2010) A framework for path-oriented network simplification. In: Advances in intelligent data analysis IX, Springer, Berlin, pp 220–231
Wolfe P (1976) Finding the nearest point in a polytope. Math Progr 11(1):128–149
Zhou F, Mahler S, Toivonen H (2010) Network simplification with minimal loss of connectivity. In: 2010 IEEE 10th international conference on data mining (ICDM), IEEE, pp 659–668
Additional information
Responsible editor: Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, Filip Zelezny.
Cite this article
Bonchi, F., De Francisci Morales, G., Gionis, A. et al. Activity preserving graph simplification. Data Min Knowl Disc 27, 321–343 (2013). https://doi.org/10.1007/s10618-013-0328-8
DOI: https://doi.org/10.1007/s10618-013-0328-8