ABSTRACT
Graph database systems (GDBs) store and retrieve graph data efficiently and have become a critical component of many applications, e.g., knowledge graphs, social networks, and fraud detection. It is therefore important to ensure that GDBs operate correctly. Logic bugs cause a GDB to return an incorrect result for a given query. These bugs are critical and can easily go unnoticed by developers when graphs and queries become complicated. Despite the importance of GDBs, logic bugs in GDBs have received less attention than those in relational database systems.
In this paper, we present Grand, an approach for automatically finding logic bugs in GDBs that adopt Gremlin as their query language. The core idea of Grand is to construct semantically equivalent databases for multiple GDBs, and then compare the results of a Gremlin query on these databases. If the results returned by different GDBs for the same query differ, the likely cause is a logic bug in at least one of them. To test GDBs effectively, we propose a model-based query generation approach that generates valid Gremlin queries likely to return non-empty results, and a data mapping approach that unifies the format of query results across GDBs. We evaluate Grand on six widely used GDBs, including Neo4j and HugeGraph. In total, we have found 21 previously unknown logic bugs in these GDBs. Developers have confirmed 18 of these bugs and fixed 7.
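The comparison step described above can be illustrated with a minimal sketch. This is not Grand's implementation: the backends are hypothetical in-memory callables standing in for real GDB clients, and the names `normalize`, `differential_test`, and the `id_map` dictionaries are assumptions made for illustration only. The sketch maps each backend's native vertex IDs to shared logical IDs (a simplified stand-in for the data mapping idea), compares results as order-insensitive multisets, and groups backends that agree.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

# Hypothetical stand-in for a GDB client: takes a query string and returns
# the backend's raw result values (here, native vertex IDs).
Backend = Callable[[str], Iterable[Hashable]]


def normalize(results: Iterable[Hashable], id_map: Dict[Hashable, Hashable]) -> Counter:
    """Translate native IDs to shared logical IDs and count occurrences.

    Using a Counter makes the comparison insensitive to result order but
    still sensitive to duplicates.
    """
    return Counter(id_map.get(r, r) for r in results)


def differential_test(
    query: str,
    backends: Dict[str, Tuple[Backend, Dict[Hashable, Hashable]]],
) -> List[List[str]]:
    """Run one query on every backend and group backends by normalized result.

    More than one group means the backends disagree, which flags the query
    for manual inspection as a potential logic bug.
    """
    groups: Dict[frozenset, List[str]] = {}
    for name, (run, id_map) in backends.items():
        key = frozenset(normalize(run(query), id_map).items())
        groups.setdefault(key, []).append(name)
    return list(groups.values())


# Three simulated backends holding the "same" two vertices under different
# native IDs. Backend C drops one vertex to simulate a logic bug.
backends = {
    "A": (lambda q: ["a1", "a2"], {"a1": "v1", "a2": "v2"}),
    "B": (lambda q: [102, 101], {101: "v1", 102: "v2"}),
    "C": (lambda q: ["v1"], {}),
}

groups = differential_test("g.V().hasLabel('person')", backends)
print(groups)
```

In this toy setup, backends A and B agree after normalization while C does not, so the query would be reported as a discrepancy. A discrepancy alone does not identify which backend is wrong; that still requires inspection.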
- 2022. ISSTA 22 Artifact for "Finding Bugs in Gremlin-Based Graph Database Systems via Randomized Differential Testing". https://doi.org/10.5281/zenodo.6534721
Index Terms
- Finding bugs in Gremlin-based graph database systems via Randomized differential testing