Abstract
Temporal graphs, which add a time dimension to graph data, are attracting increasing interest from research communities. Existing temporal graph storage formats mainly include copy-based models, log-based models, and the hybrid models that have emerged in recent years. Neither the copy-based model nor the log-based model trades off storage and query time well. Hybrid models try to find a compromise between the two, but existing hybrid models do not consider that the skewness of vertex degree in temporal graphs changes over time. Based on these considerations, we propose LSM-Subgraph, a hybrid storage format that stores only the snapshots selected by a fluctuation-aware method, together with the logs in between. First, LSM-Subgraph uses a PMA-based snapshot creation model that stores snapshots in packed memory arrays (PMA), which avoids rebuilding the whole data structure. Second, LSM-Subgraph uses a fluctuation-aware timepoint selection method to divide shards during updates, which achieves a good tradeoff between storage overhead and query time. Extensive experimental evaluations over various real-world graphs show that LSM-Subgraph outperforms state-of-the-art temporal graph systems in both memory and time consumption.
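The general idea of a hybrid (snapshot plus log) format can be illustrated with a minimal sketch. This is not the LSM-Subgraph implementation: the class name HybridTemporalGraph, the fixed snapshot_every interval (standing in for the fluctuation-aware timepoint selection), and the use of Python sets (standing in for packed memory arrays) are all assumptions made for illustration. Update times are assumed to be strictly increasing and non-negative.

```python
from bisect import bisect_right


class HybridTemporalGraph:
    """Toy hybrid store: occasional full snapshots plus an edge log in between.

    Hypothetical illustration only. Snapshots are taken every
    `snapshot_every` updates, not by a fluctuation-aware policy, and
    edge sets are plain Python sets rather than packed memory arrays.
    """

    def __init__(self, snapshot_every=2):
        self.snapshot_every = snapshot_every
        self.snapshot_times = [0]        # sorted snapshot timestamps
        self.snapshots = [frozenset()]   # edge set stored at each snapshot
        self.log_offsets = [0]           # log position right after each snapshot
        self.log = []                    # (time, op, edge) in time order
        self.current = set()             # latest version of the edge set
        self.pending = 0                 # updates since the last snapshot

    def update(self, time, op, edge):
        """Apply an insert ('+') or delete ('-') at a strictly increasing time."""
        if op == "+":
            self.current.add(edge)
        else:
            self.current.discard(edge)
        self.log.append((time, op, edge))
        self.pending += 1
        if self.pending >= self.snapshot_every:
            # Materialize a snapshot; later queries replay the log from here.
            self.snapshot_times.append(time)
            self.snapshots.append(frozenset(self.current))
            self.log_offsets.append(len(self.log))
            self.pending = 0

    def edges_at(self, t):
        """Rebuild the edge set at time t (t >= 0) from the nearest earlier snapshot."""
        i = bisect_right(self.snapshot_times, t) - 1
        edges = set(self.snapshots[i])
        for time, op, edge in self.log[self.log_offsets[i]:]:
            if time > t:
                break
            if op == "+":
                edges.add(edge)
            else:
                edges.discard(edge)
        return edges


g = HybridTemporalGraph(snapshot_every=2)
g.update(1, "+", (1, 2))
g.update(2, "+", (2, 3))   # triggers a snapshot at time 2
g.update(3, "-", (1, 2))
g.update(4, "+", (3, 4))   # triggers a snapshot at time 4
g.update(5, "+", (4, 5))
print(sorted(g.edges_at(3)))
```

A query for time 3 starts from the snapshot at time 2 and replays a single log entry. More frequent snapshots shorten the replay but cost more storage, which is the tradeoff the abstract describes.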
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ma, J. et al. (2023). LSM-Subgraph: Log-Structured Merge-Subgraph for Temporal Graph Processing. In: Li, B., Yue, L., Tao, C., Han, X., Calvanese, D., Amagasa, T. (eds) Web and Big Data. APWeb-WAIM 2022. Lecture Notes in Computer Science, vol 13421. Springer, Cham. https://doi.org/10.1007/978-3-031-25158-0_39
DOI: https://doi.org/10.1007/978-3-031-25158-0_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25157-3
Online ISBN: 978-3-031-25158-0
eBook Packages: Computer Science, Computer Science (R0)