Abstract
Triangle enumeration is a fundamental graph analytics problem with many applications, including fake account detection, spam detection, and community search. Real-world graph data sets are growing to unprecedented sizes, and many existing algorithms either fail to process them or take a very long time to produce results. Many organizations invest in new hardware and new services in order to keep up with the data growth, and often neglect the well-established and widely used relational database management systems (RDBMSs). In this paper we present a carefully engineered RDBMS solution to the problem of triangle enumeration for very large graphs. We show that RDBMSs are suitable tools for enumerating billions of triangles in billion-scale networks on a consumer-grade server. We also compare the performance of our RDBMS solution to that of a native graph database and show that our solution outperforms it by an order of magnitude.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmed, A., Enns, K., Thomo, A. (2021). Triangle Enumeration for Billion-Scale Graphs in RDBMS. In: Barolli, L., Woungang, I., Enokido, T. (eds) Advanced Information Networking and Applications. AINA 2021. Lecture Notes in Networks and Systems, vol 226. Springer, Cham. https://doi.org/10.1007/978-3-030-75075-6_13
DOI: https://doi.org/10.1007/978-3-030-75075-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75074-9
Online ISBN: 978-3-030-75075-6
eBook Packages: Intelligent Technologies and Robotics (R0)