Hinode: implementing a vertex-centric modelling approach to maintaining historical graph data

Kosmatopoulos, Andreas; Gounaris, Anastasios; Tsichlas, Kostas

doi:10.1007/s00607-019-00715-6

Published: 27 March 2019

Computing, Volume 101, pages 1885–1908 (2019)

Andreas Kosmatopoulos ORCID: orcid.org/0000-0001-5334-741X¹,
Anastasios Gounaris¹ &
Kostas Tsichlas¹


Abstract

Over the past few years, there has been a rapid increase in data originating from evolving networks such as social networks and sensor networks. A major challenge that arises when handling such networks and their respective graphs is the ability to issue a historical query on their data, that is, a query concerned with the state of the graph at previous time instances. While a number of works index the historical data in a time-centric manner (i.e. according to the time instance at which an update event occurs), in this work we focus on the less-explored vertex-centric storage approach (i.e. according to the entity on which an update event occurs). We demonstrate that the design choices for a vertex-centric model are not trivial by proposing two different modelling and storage models that leverage NoSQL technology and investigating their tradeoffs. More specifically, we experimentally evaluate the two models and show that in certain cases their relative performance can differ several-fold. Finally, we provide evidence that simple baseline and non-NoSQL solutions are slower by up to an order of magnitude.
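
To make the vertex-centric idea in the abstract concrete, the following is a minimal in-memory sketch, not the storage model proposed in the article. It assumes that each vertex keeps, per neighbour, a list of half-open validity intervals, so that a historical query touches only the vertex it concerns. The class name `VertexCentricStore` and all method names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from math import inf


class VertexCentricStore:
    """Illustrative sketch: edge history grouped by vertex, not by time."""

    def __init__(self):
        # vertex -> neighbour -> list of [start, end) validity intervals
        self.history = defaultdict(lambda: defaultdict(list))

    def add_edge(self, u, v, t):
        # An undirected edge becomes valid at time t for both endpoints.
        self.history[u][v].append([t, inf])
        self.history[v][u].append([t, inf])

    def remove_edge(self, u, v, t):
        # Close the most recent open interval on both endpoints.
        self.history[u][v][-1][1] = t
        self.history[v][u][-1][1] = t

    def neighbours_at(self, v, t):
        # One-hop neighbourhood of v as it was at time instance t.
        return {
            n
            for n, intervals in self.history[v].items()
            if any(start <= t < end for start, end in intervals)
        }

    def degree_at(self, v, t):
        return len(self.neighbours_at(v, t))


store = VertexCentricStore()
store.add_edge("a", "b", 1)
store.add_edge("a", "c", 2)
store.remove_edge("a", "b", 3)
print(sorted(store.neighbours_at("a", 2)))  # ['b', 'c']
print(store.degree_at("a", 3))              # 1
```

A time-centric index would instead group updates by the time instance at which they occur, so answering the same query would require locating the relevant snapshot or log position first.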


Notes

Source code available at https://github.com/hinodeauthors/hinode.
For reasons of clarity edge labels, vertex colors and edge weights are not shown.
The queries executed were Average Vertex Degree, Degree Distribution and One-Hop Neighborhood Retrieval on the last snapshot of the sequence; see Sect. 5.1.
cit-HepTh SNAP dataset [11]; see Sect. 5.1.
Source code available at https://github.com/akosmato/HinodeNoSQL.
Due to difficulties in assigning some specific edges to a particular snapshot, we removed $0.4\%$ of the total edges in the “hep-th” and “hep-ph” datasets and $0.04\%$ of the edges in the “USPatents” dataset.
As an example, a sequence of 100 snapshots that gets indexed every 20 snapshots would comprise five smaller ST indices, whereas a vertex that exists in the first 75 snapshots would only be present in the first four smaller ST indices.
For example, a performance ratio of 2 corresponds to MT requiring half the execution time of ST.
https://neo4j.com/.
Measured through the “nodetool” utility.
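
The arithmetic in the note on segmented ST indices can be checked with a short sketch. It assumes 1-based snapshot numbers and fixed-size segments, which the note implies but does not state outright; the function names `segment_count` and `segments_containing` are hypothetical.

```python
import math


def segment_count(num_snapshots: int, segment_size: int) -> int:
    """Number of smaller indices needed to cover the whole sequence."""
    return math.ceil(num_snapshots / segment_size)


def segments_containing(first: int, last: int, segment_size: int) -> list:
    """0-based segment numbers that overlap snapshots first..last (1-based, inclusive)."""
    start = (first - 1) // segment_size
    end = (last - 1) // segment_size
    return list(range(start, end + 1))


print(segment_count(100, 20))          # 5
print(segments_containing(1, 75, 20))  # [0, 1, 2, 3]
```

A vertex alive in snapshots 1 to 75 therefore falls in four of the five segments, matching the example in the note.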

References

1. Akiba T, Iwata Y, Yoshida Y (2014) Dynamic and historical shortest-path distance queries on large evolving networks by pruned landmark labeling. In: 23rd international world wide web conference, WWW’14, pp 237–248
2. Apache Giraph. http://giraph.apache.org/. Accessed 12 July 2018
3. Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
4. Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I (2014) GraphX: graph processing in a distributed dataflow framework. OSDI 14:599–613
5. Huo W, Tsotras VJ (2014) Efficient temporal shortest path queries on evolving social graphs. In: Conference on scientific and statistical database management, SSDBM ’14, pp 38:1–38:4
6. Khurana U, Deshpande A (2013) Efficient snapshot retrieval over historical graph data. In: 29th IEEE international conference on data engineering, ICDE 2013, Brisbane, April 8–12, pp 997–1008
7. Khurana U, Deshpande A (2016) Storing and analyzing historical graph data at scale. In: Proceedings of the 19th international conference on extending database technology, EDBT 2016, pp 65–76
8. Kosmatopoulos A, Giannakopoulou K, Papadopoulos AN, Tsichlas K (2016) An overview of methods for handling evolving graph sequences. In: Algorithmic aspects of cloud computing, pp 181–192. Springer, Berlin
9. Kosmatopoulos A, Tsichlas K, Gounaris A, Sioutas S, Pitoura E (2017) Hinode: an asymptotically space-optimal storage model for historical queries on graphs. Distrib Parallel Databases 35:249. https://doi.org/10.1007/s10619-017-7207-z
10. Labouseur AG, Birnbaum J, Olsen PW, Spillane SR, Vijayan J, Hwang J, Han W (2015) The G* graph database: efficiently managing large distributed dynamic graphs. Distrib Parallel Databases 33(4):479–514
11. Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data
12. Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data, pp 135–146. ACM
13. Ren C, Lo E, Kao B, Zhu X, Cheng R (2011) On querying historical evolving graph sequences. PVLDB 4(11):726–737
14. Salzberg B, Tsotras VJ (1999) Comparison of access methods for time-evolving data. ACM Comput Surv (CSUR) 31(2):158–221
15. Semertzidis K, Pitoura E (2016) Durable graph pattern queries on historical graphs. In: 32nd IEEE international conference on data engineering, ICDE 2016, Helsinki, May 16–20, 2016, pp 541–552
16. Semertzidis K, Pitoura E, Lillis K (2015) Timereach: historical reachability queries on evolving graphs. In: Proceedings of the 18th international conference on extending database technology, EDBT 2015, Brussels, Belgium, March 23–27, pp 121–132
17. Shao B, Wang H, Li Y (2013) Trinity: a distributed graph engine on a memory cloud. In: Proceedings of the ACM SIGMOD international conference on management of data, SIGMOD 2013, pp 505–516
18. Spillane SR, Birnbaum J, Bokser D, Kemp D, Labouseur AG, Olsen PW, Vijayan J, Hwang J, Yoon J (2013) A demonstration of the G* graph database system. In: 29th IEEE international conference on data engineering, ICDE 2013, Brisbane, April 8–12, pp 1356–1359
19. Yang Y, Yu JX, Gao H, Pei J, Li J (2014) Mining most frequently changing component in evolving graphs. World Wide Web 17(3):351–376

Author information

Authors and Affiliations

Department of Informatics, Aristotle University of Thessaloniki, Thessaloníki, Greece
Andreas Kosmatopoulos, Anastasios Gounaris & Kostas Tsichlas

Corresponding author

Correspondence to Andreas Kosmatopoulos.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kosmatopoulos, A., Gounaris, A. & Tsichlas, K. Hinode: implementing a vertex-centric modelling approach to maintaining historical graph data. Computing 101, 1885–1908 (2019). https://doi.org/10.1007/s00607-019-00715-6

Received: 16 July 2018
Accepted: 21 March 2019
Published: 27 March 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s00607-019-00715-6
