Abstract
Graph-structured data can be found in nearly every aspect of today’s world, which makes graphs increasingly important for storing and processing data. From a processing perspective, finding comprehensive patterns in graph-structured data is a core primitive in a variety of applications, such as fraud detection, biological engineering, or social graph analytics. On the hardware side, multiprocessor systems, which combine multiple processors in a single scale-up server, are the next important wave on top of multi-core systems. In particular, symmetric multiprocessor systems (SMPs) are characterized by the fact that every processor has the same architecture, e.g., every processor is a multi-core, and all processors share a common and large main memory space. Moreover, large SMPs will feature non-uniform memory access (NUMA), whose impact on the design of efficient data processing concepts is considerable. In this paper, we give an overview of NeMeSys, our system for scalable near-memory graph pattern matching (GPM) on SMPs. NeMeSys is built on a synthesis of well-known database system concepts, including a set of graph-tailored and hardware-oriented optimization techniques for scalable GPM on SMPs.
Acknowledgment
This work was partly funded by the German Research Foundation (DFG) within the CRC 912 (HAEC).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Krause, A., Habich, D., Lehner, W. (2020). Scalable In-Memory Graph Pattern Matching on Symmetric Multiprocessor Systems. In: Qin, L., et al. Software Foundations for Data Interoperability and Large Scale Graph Data Analytics. SFDI LSGDA 2020 2020. Communications in Computer and Information Science, vol 1281. Springer, Cham. https://doi.org/10.1007/978-3-030-61133-0_4
DOI: https://doi.org/10.1007/978-3-030-61133-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61132-3
Online ISBN: 978-3-030-61133-0
eBook Packages: Computer Science, Computer Science (R0)