Research article. DOI: 10.1145/3210259.3210262

GraphTides: a framework for evaluating stream-based graph processing platforms

Published: 10 June 2018

ABSTRACT

Stream-based graph systems continuously ingest graph-changing events via an established input stream, performing the required computation on the corresponding graph. While there are various benchmarking and evaluation approaches for traditional, batch-oriented graph processing systems, there are no common procedures for evaluating stream-based graph systems. We, therefore, present GraphTides, a generic framework which includes the definition of an appropriate system model, an exploration of the parameter space, suitable workloads, and computations required for evaluating such systems. Furthermore, we propose a methodology and provide an architecture for running experimental evaluations. With our framework, we hope to systematically support system development, performance measurements, engineering, and comparisons of stream-based graph systems.
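
To make the system model in the abstract concrete, the following is a minimal sketch of a consumer that ingests graph-changing events from a stream and keeps the corresponding graph up to date. It is not taken from GraphTides: the event tuple format, the `StreamedGraph` class, and the `ingest` function are hypothetical names chosen for illustration, and the "computation" is reduced to a vertex degree lookup.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical event format (an assumption, not the paper's):
# (operation, source vertex, target vertex).
Event = Tuple[str, str, str]


class StreamedGraph:
    """Undirected graph kept up to date by a stream of edge events."""

    def __init__(self) -> None:
        self.adjacency: Dict[str, Set[str]] = defaultdict(set)

    def apply(self, event: Event) -> None:
        """Apply a single graph-changing event to the current graph state."""
        op, u, v = event
        if op == "add_edge":
            self.adjacency[u].add(v)
            self.adjacency[v].add(u)
        elif op == "remove_edge":
            self.adjacency[u].discard(v)
            self.adjacency[v].discard(u)
        else:
            raise ValueError(f"unknown operation: {op}")

    def degree(self, vertex: str) -> int:
        """Return the number of neighbours; unknown vertices have degree 0."""
        return len(self.adjacency.get(vertex, ()))


def ingest(graph: StreamedGraph, events: Iterable[Event]) -> int:
    """Apply events in arrival order and return how many were processed."""
    count = 0
    for event in events:
        graph.apply(event)
        count += 1
    return count


stream = [
    ("add_edge", "a", "b"),
    ("add_edge", "a", "c"),
    ("remove_edge", "a", "b"),
]
graph = StreamedGraph()
processed = ingest(graph, stream)
print(processed, graph.degree("a"))
```

An evaluation harness in this spirit would replay a recorded event stream at a controlled rate and record metrics such as the number of events processed per second; the sketch only counts processed events and leaves timing out.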

Published in

GRADES-NDA '18: Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)
June 2018
94 pages
ISBN: 9781450356954
DOI: 10.1145/3210259

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

GRADES-NDA '18 paper acceptance rate: 10 of 26 submissions, 38%. Overall acceptance rate: 29 of 61 submissions, 48%.
