GraphTides: a framework for evaluating stream-based graph processing platforms

Authors:
Benjamin Erb (Ulm University, Germany)
Dominik Meißner (Ulm University, Germany)
Frank Kargl (Ulm University, Germany)
Benjamin A. Steer (Queen Mary, University of London, London, United Kingdom)
Felix Cuadrado (Queen Mary, University of London, London, United Kingdom)
Domagoj Margan (Imperial College London, London, United Kingdom)
Peter Pietzuch (Imperial College London, London, United Kingdom)

GRADES-NDA '18: Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), June 2018, Article No. 3, Pages 1–10. https://doi.org/10.1145/3210259.3210262

Published: 10 June 2018

ABSTRACT

Stream-based graph systems continuously ingest graph-changing events via an established input stream and perform the required computations on the corresponding graph. While there are various benchmarking and evaluation approaches for traditional, batch-oriented graph processing systems, there are no common procedures for evaluating stream-based graph systems. We therefore present GraphTides, a generic framework that includes the definition of an appropriate system model, an exploration of the parameter space, suitable workloads, and the computations required for evaluating such systems. Furthermore, we propose a methodology and provide an architecture for running experimental evaluations. With our framework, we hope to systematically support system development, performance measurements, engineering, and comparisons of stream-based graph systems.
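
To make the notion of a stream-based graph system more concrete, the following minimal Python sketch applies a stream of graph-changing events to an in-memory graph and emits a simple result after each event. The event names (`add_vertex`, `remove_vertex`, `add_edge`, `remove_edge`), the tuple encoding, and the `StreamGraph` and `process` helpers are assumptions made for illustration; they are not taken from GraphTides, and a real system would typically be distributed and run far more involved computations.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical event encoding: ("add_edge", "a", "b"), ("add_vertex", "a"), ...
Event = Tuple[str, ...]


class StreamGraph:
    """Undirected simple graph that is mutated one event at a time."""

    def __init__(self) -> None:
        self.adj: Dict[str, Set[str]] = {}

    def apply(self, event: Event) -> None:
        kind = event[0]
        if kind == "add_vertex":
            self.adj.setdefault(event[1], set())
        elif kind == "remove_vertex":
            # Drop the vertex and every edge that touches it.
            for neighbour in self.adj.pop(event[1], set()):
                if neighbour in self.adj:
                    self.adj[neighbour].discard(event[1])
        elif kind == "add_edge":
            u, v = event[1], event[2]
            if u == v:
                raise ValueError("self-loops are not supported in this sketch")
            # Endpoints are created implicitly if they do not exist yet.
            self.adj.setdefault(u, set()).add(v)
            self.adj.setdefault(v, set()).add(u)
        elif kind == "remove_edge":
            u, v = event[1], event[2]
            self.adj.get(u, set()).discard(v)
            self.adj.get(v, set()).discard(u)
        else:
            raise ValueError(f"unknown event type: {kind}")

    def vertex_count(self) -> int:
        return len(self.adj)

    def edge_count(self) -> int:
        # Each undirected edge is stored once per endpoint.
        return sum(len(neighbours) for neighbours in self.adj.values()) // 2


def process(events: Iterable[Event]) -> Iterator[Tuple[int, int]]:
    """Ingest events in order; yield (vertices, edges) after each one."""
    graph = StreamGraph()
    for event in events:
        graph.apply(event)
        yield graph.vertex_count(), graph.edge_count()
```

For example, feeding the events `("add_vertex", "a")`, `("add_edge", "a", "b")`, `("add_edge", "b", "c")`, `("remove_vertex", "b")` through `process` yields `(1, 0)`, `(2, 1)`, `(3, 2)`, `(2, 0)`. An evaluation harness could replay such an event stream at a controlled rate and record how quickly the system under test reflects each change in its results.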

Published in
GRADES-NDA '18: Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)
June 2018
94 pages
ISBN: 9781450356954
DOI: 10.1145/3210259
Editors:
Akhil Arora (American Express Big Data Labs)
Arnab Bhattacharya (Indian Institute of Technology, Kanpur, India)
George Fletcher (Eindhoven University of Technology)
Josep Lluis Larriba Pey (UPC)
Shourya Roy (American Express Big Data Labs)
Robert West (EPFL)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
evaluation
evolving graphs
graph analytics
graph processing
measurements
stream-based graphs
temporal graphs
Qualifiers
- research-article
Acceptance Rates
GRADES-NDA '18 paper acceptance rate: 10 of 26 submissions, 38%. Overall acceptance rate: 29 of 61 submissions, 48%.