Graph database benchmarking on cloud environments with XGDBench

Automated Software Engineering

Abstract

Online graph database service providers have started migrating their operations to public clouds because of the increasing demand for low-cost, ubiquitous graph data storage and analysis. However, little support is available for benchmarking graph database systems in cloud environments. We describe XGDBench, a graph database benchmarking platform for cloud computing systems. XGDBench is designed to be an extensible platform for graph database benchmarking, which makes it suitable for benchmarking future HPC systems. It extends the Yahoo! Cloud Serving Benchmark (YCSB) to the area of graph database benchmarking. The platform is written in X10, a partitioned global address space (PGAS) language intended for programming future HPC systems. We describe the architecture of XGDBench and explain how it differs from the current state of the art. We evaluate the performance of five well-known graph data stores (AllegroGraph, Fuseki, Neo4j, OrientDB, and Titan) using XGDBench on the Tsubame 2.0 HPC cloud environment.

Acknowledgements

This research was supported by the Japan Science and Technology Agency’s CREST project titled “Development of System Software Technologies for post-Peta Scale High Performance Computing”.

Author information

Corresponding author

Correspondence to Miyuru Dayarathna.

About this article

Cite this article

Dayarathna, M., Suzumura, T. Graph database benchmarking on cloud environments with XGDBench. Autom Softw Eng 21, 509–533 (2014). https://doi.org/10.1007/s10515-013-0138-7

  • DOI: https://doi.org/10.1007/s10515-013-0138-7
