Link and Graph Mining in the Big Data Era

Handbook of Big Data Technologies

Abstract

Graphs are a convenient representation for large data sets such as complex networks, social networks, and publication networks. The growing volume of data modeled as complex networks, e.g., the World Wide Web and social networks such as Twitter and Facebook, has given rise to a new area of research focused on mining complex networks. In this multidisciplinary area, several important tasks stand out: extraction of statistical properties, community detection, and link prediction, among others. This line of research has been driven largely by the growing availability of computers and communication networks, which allow us to gather and analyze data on a scale far larger than previously possible. In this chapter we give an overview of several graph mining approaches for mining and handling large complex networks.


Author information

Correspondence to Ana Paula Appel.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Appel, A.P., Moyano, L.G. (2017). Link and Graph Mining in the Big Data Era. In: Zomaya, A., Sakr, S. (eds) Handbook of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-49340-4_17

  • DOI: https://doi.org/10.1007/978-3-319-49340-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49339-8

  • Online ISBN: 978-3-319-49340-4

  • eBook Packages: Computer Science (R0)
