Abstract
A graph embedding algorithm maps a graph into a low-dimensional space such that the embedding preserves the inherent properties of the graph. Although graph embedding is fundamentally related to graph visualization, prior work did not exploit this connection explicitly. We develop Force2Vec, which uses force-directed graph layout models in a graph embedding setting with the aim of excelling in both machine learning (ML) and visualization tasks. We make Force2Vec highly parallel by mapping its core computations to linear algebra and exploiting the multiple levels of parallelism available in modern processors. The resulting algorithm is an order of magnitude faster than existing methods (43× faster than DeepWalk, on average) and can generate embeddings from graphs with billions of edges in a few hours. Compared with existing methods, Force2Vec is better at graph visualization and performs comparably or better in ML tasks such as link prediction, node classification, and clustering. Source code is available at https://github.com/HipGraph/Force2Vec. This paper is an extension of a conference paper by Rahman et al. (2020b) published in IEEE ICDM 2020.
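To make the idea of a force-directed embedding expressed as linear algebra concrete, the following is a minimal sketch and not the Force2Vec implementation. It assumes a sigmoid-based force model in which neighbors attract and randomly sampled non-neighbors repel, a dense adjacency matrix, and synchronous updates of all vertices. The function name `force_directed_embedding` and all of its parameters are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def force_directed_embedding(adj, dim=2, epochs=200, lr=0.05, num_neg=3, seed=0):
    """Illustrative force-directed embedding (assumed model, not the paper's code).

    adj: dense symmetric 0/1 adjacency matrix with a zero diagonal, shape (n, n).
    Returns an (n, dim) array of vertex coordinates.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    Z = rng.uniform(-0.5, 0.5, size=(n, dim))
    non_edge = 1.0 - adj - np.eye(n)  # 1 where u != v and (u, v) is not an edge
    rows = np.repeat(np.arange(n), num_neg)

    for _ in range(epochs):
        S = sigmoid(Z @ Z.T)  # pairwise similarities of current coordinates

        # Attractive forces: each vertex is pulled toward its neighbors.
        # This is the gradient of sum_v log(sigmoid(z_u . z_v)) over edges.
        attract = (adj * (1.0 - S)) @ Z

        # Repulsive forces: each vertex is pushed away from a few sampled
        # vertices; samples that are neighbors or the vertex itself are dropped.
        cols = rng.integers(0, n, size=(n, num_neg)).ravel()
        neg_mask = np.zeros((n, n))
        np.add.at(neg_mask, (rows, cols), 1.0)
        neg_mask *= non_edge
        repel = -(neg_mask * S) @ Z

        # All vertices move at once, so each epoch is a handful of matrix products.
        Z = Z + lr * (attract + repel)
    return Z
```

With `dim=2` the returned coordinates can be plotted directly as a layout, while a larger `dim` gives feature vectors for downstream ML tasks. A scalable implementation would replace the dense `n × n` products with sparse, batched kernels.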
References
Ahmed A, Shervashidze N, Narayanamurthy S, Josifovski V, Smola AJ (2013) Distributed large-scale natural graph factorization. In: Proceedings of WWW, pp 37–48
Akoglu L, McGlohon M, Faloutsos C (2010) Oddball: spotting anomalies in weighted graphs. In: Proceedings of PAKDD. Springer, pp 410–421
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008
Bolosky WJ, Scott ML (1993) False sharing and its effect on shared memory performance. In: 4th symposium on experimental distributed and multiprocessor systems, pp 57–71
Brandes U, Pich C (2008) An experimental study on distance-based graph drawing. In: International symposium on graph drawing. Springer, pp 218–229
Cai H, Zheng VW, Chang KC-C (2018) A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans Knowl Data Eng 30(9):1616–1637
Cao S, Lu W, Xu Q (2015) GraRep: learning graph representations with global structural information. In: Proceedings of CIKM, pp 891–900
Chen H, Perozzi B, Hu Y, Skiena S (2018) HARP: hierarchical representation learning for networks. In: Proceedings of AAAI
De Luca F, Hossain I, Kobourov S, Börner K (2019) Multi-level tree based approach for interactive graph visualization with semantic zoom. arXiv preprint arXiv:1906.05996
Erdős P, Harary F, Tutte WT (1965) On the dimension of a graph. Mathematika 12(2):118–122
Fruchterman TM, Reingold EM (1991) Graph drawing by force-directed placement. Softw Pract Exp 21(11):1129–1164
Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: Proceedings of KDD. ACM, pp 855–864
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. In: Proceedings of NIPS, pp 1024–1034
Henderson K, Gallagher B, Eliassi-Rad T, Tong H, Basu S, Akoglu L, Koutra D, Faloutsos C, Li L (2012) RolX: structural role extraction & mining in large graphs. In: Proceedings of KDD, pp 1231–1239
Jacomy M, Venturini T, Heymann S, Bastian M (2014) ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. PLoS ONE 9(6)
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907
Lee JA, Verleysen M (2010) Scale-independent quality criteria for dimensionality reduction. Pattern Recognit Lett 31(14):2248–2257
Luo D, Nie F, Huang H, Ding CH (2011) Cauchy graph embedding. In: Proceedings of ICML, pp 553–560
Maaten Lvd, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(Nov):2579–2605
Martin S, Brown WM, Klavans R, Boyack KW (2011) OpenOrd: an open-source toolbox for large graph layout. In: Visualization and data analysis 2011, vol 7868. International Society for Optics and Photonics, p 786806
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of NIPS, pp 3111–3119
Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical report, Stanford InfoLab
Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceedings of KDD, pp 701–710
Rahman MK, Sujon MH, Azad A (2020a) BatchLayout: a batch-parallel force-directed graph layout algorithm in shared memory. In: Proceedings of PacificVis. IEEE, pp 16–25
Rahman MK, Sujon MH, Azad A (2020b) Force2Vec: parallel force-directed graph embedding. In: 20th IEEE international conference on data mining (IEEE ICDM)
Recht B, Re C, Wright S, Niu F (2011) Hogwild: a lock-free approach to parallelizing stochastic gradient descent. In: Proceedings of NIPS, pp 693–701
Ribeiro LF, Saverese PH, Figueiredo DR (2017) struc2vec: learning node representations from structural identity. In: Proceedings of KDD, pp 385–394
Tang J, Liu J, Zhang M, Mei Q (2016) Visualizing large-scale and high-dimensional data. In: Proceedings of WWW, pp 287–297
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) LINE: large-scale information network embedding. In: Proceedings of WWW, pp 1067–1077
Tsitsulin A, Mottin D, Karras P, Müller E (2018) VERSE: versatile graph embeddings from similarity measures. In: Proceedings of WWW, pp 539–548
Tutte WT (1963) How to draw a graph. Proc Lond Math Soc 3(1):743–767
Walshaw C (2006) A multilevel algorithm for force-directed graph-drawing. J Graph Algorithms Appl 7(3):253–285
Whaley RC, Petitet A, Dongarra JJ (2001) Automated empirical optimization of software and the ATLAS project. Parallel Comput 27(1–2):3–35
Zeng H, Zhou H, Srivastava A, Kannan R, Prasanna V (2019) GraphSAINT: graph sampling based inductive learning method. arXiv preprint arXiv:1907.04931
Acknowledgements
We would like to thank anonymous reviewers for their feedback. Funding for this work was provided by the Indiana University Grand Challenge Precision Health Initiative.
Cite this article
Rahman, M.K., Sujon, M.H. & Azad, A. Scalable force-directed graph representation learning and visualization. Knowl Inf Syst 64, 207–233 (2022). https://doi.org/10.1007/s10115-021-01634-9
DOI: https://doi.org/10.1007/s10115-021-01634-9