
Triangle Enumeration for Billion-Scale Graphs in RDBMS

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 226)

Abstract

Triangle enumeration is a fundamental graph analytics problem with many applications, including fake account detection, spam detection, and community search. Real-world graph data sets are growing to unprecedented sizes, and many existing algorithms either fail to process them or take a very long time to produce results. Many organizations invest in new hardware and new services to keep up with this data growth, and often neglect the well-established and widely used relational database management systems (RDBMSs). In this paper we present a carefully engineered RDBMS solution to the problem of triangle enumeration for very large graphs. We show that RDBMSs are suitable tools for enumerating billions of triangles in billion-scale networks on a consumer-grade server. We also compare the performance of our RDBMS solution with that of a native graph database and show that the RDBMS solution is faster by an order of magnitude.
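To make the problem concrete, the following is a minimal sketch of triangle enumeration expressed as a relational self-join. It is an illustration only, not the engineered solution the paper presents: the table name `edge`, the function name `enumerate_triangles`, and the use of an in-memory SQLite database are all assumptions chosen for brevity. Each undirected edge is stored once, oriented from the smaller to the larger vertex id, so every triangle is reported exactly once.

```python
import sqlite3


def enumerate_triangles(edges):
    """Return every triangle once as a tuple (a, b, c) with a < b < c.

    `edges` is an iterable of (u, v) pairs describing an undirected graph.
    The table layout and query below are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge (src INTEGER, dst INTEGER, PRIMARY KEY (src, dst))"
    )
    # Keep one copy of each undirected edge, oriented low id -> high id,
    # and drop self-loops, which cannot be part of a triangle.
    oriented = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    conn.executemany("INSERT INTO edge VALUES (?, ?)", sorted(oriented))

    # e1 = (a, b), e2 = (b, c), e3 = (a, c); the orientation implies a < b < c.
    rows = conn.execute(
        """
        SELECT e1.src, e1.dst, e2.dst
        FROM edge AS e1
        JOIN edge AS e2 ON e2.src = e1.dst
        JOIN edge AS e3 ON e3.src = e1.src AND e3.dst = e2.dst
        ORDER BY 1, 2, 3
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 1)]
    for triangle in enumerate_triangles(sample):
        print(triangle)
```

Orienting the edges by vertex id is one simple way to avoid listing the same triangle several times. A solution aimed at billion-scale graphs would need far more care with ordering, indexing, and memory than this sketch shows.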

Author information

Corresponding author

Correspondence to Aly Ahmed.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Ahmed, A., Enns, K., Thomo, A. (2021). Triangle Enumeration for Billion-Scale Graphs in RDBMS. In: Barolli, L., Woungang, I., Enokido, T. (eds) Advanced Information Networking and Applications. AINA 2021. Lecture Notes in Networks and Systems, vol 226. Springer, Cham. https://doi.org/10.1007/978-3-030-75075-6_13
