ABSTRACT
Graph processing is increasingly popular given the wide range of phenomena that can be represented as graphs (e.g., social media networks, pharmaceutical drug compounds, and fraud networks). The growing volume of available data requires new approaches to ingest and process it efficiently. In this work, we describe, at a conceptual level, a solution named Graph-Inceptor in the context of the Graph-Massivizer architecture. Graph-Inceptor aims to fill a gap among ETL (extract, transform, load) tools: it enables the data transformations required for graph creation and enrichment, and it supports connectors to multiple graph storage systems at massive scale. Furthermore, it aims to enhance ETL operations by learning from data content and load, and by making decisions based on machine-learning-driven predictive analytics.
Index Terms
- Graph-Inceptor: Towards Extreme Data Ingestion, Massive Graph Creation and Storage