Abstract
Due to the continuously increasing rate of data production from multiple sources, especially social media, data analysis techniques increasingly focus on identifying patterns in the resulting graphs and extracting knowledge from them. Most existing techniques begin with given patterns and calculate their coverage in the graph. Here, we propose a graph mining architecture that focuses on finding small sub-graph patterns, referred to as canned patterns, in a database of graphs without any domain knowledge of the graphs. These patterns can be used to reduce query formulation time, increase domain knowledge, and support data analysis. The canned patterns should maximize coverage and diversity over the graph database while minimizing the cognitive load of the patterns. The approach presented here is based on an innovative modular architecture that combines state-of-the-art techniques to extract these patterns and validate the extracted result.
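To make the selection objective concrete, the following is a minimal sketch of a greedy selection that trades off coverage, diversity, and cognitive load. It is not the architecture proposed in the paper. Everything in it is an assumption made for illustration: graphs and patterns are modelled as sets of undirected labelled edges, a pattern "covers" a graph when all of its edges appear in that graph (a simplification that stands in for true subgraph matching), diversity is one minus the highest Jaccard similarity to the patterns already chosen, and cognitive load is approximated by the edge count scaled by a hypothetical `load_weight`.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def edge(u: str, v: str) -> Edge:
    """Normalise an undirected edge so (u, v) and (v, u) compare equal."""
    return (u, v) if u <= v else (v, u)


def covered(pattern: Graph, graphs: List[Graph]) -> Set[int]:
    """Indices of graphs containing every edge of the pattern.

    Assumption: edge-set containment stands in for subgraph matching.
    """
    return {i for i, g in enumerate(graphs) if pattern <= g}


def jaccard(a: Graph, b: Graph) -> float:
    """Jaccard similarity of two edge sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_patterns(candidates: List[Graph], graphs: List[Graph],
                    k: int, load_weight: float = 0.1) -> List[Graph]:
    """Greedily pick up to k patterns by coverage gain + diversity - load."""
    selected: List[Graph] = []
    already_covered: Set[int] = set()
    remaining = list(candidates)

    def score(p: Graph) -> float:
        # Fraction of graphs newly covered by this pattern.
        gain = len(covered(p, graphs) - already_covered) / len(graphs)
        # Dissimilarity to the closest pattern chosen so far.
        diversity = 1.0 - max((jaccard(p, s) for s in selected), default=0.0)
        # Hypothetical proxy for cognitive load: larger patterns cost more.
        load = load_weight * len(p)
        return gain + diversity - load

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        already_covered |= covered(best, graphs)
        remaining.remove(best)
    return selected
```

Under these assumptions, a candidate that overlaps heavily with an already selected pattern and covers no new graphs scores low, so the greedy loop tends to prefer small patterns that reach graphs not yet covered. The weighting of the three terms is arbitrary here and would need tuning in any real setting.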
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tzanikos, M., Krommyda, M., Kantere, V. (2021). A Highly Modular Architecture for Canned Pattern Selection Problem. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12924. Springer, Cham. https://doi.org/10.1007/978-3-030-86475-0_8
DOI: https://doi.org/10.1007/978-3-030-86475-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86474-3
Online ISBN: 978-3-030-86475-0
eBook Packages: Computer Science, Computer Science (R0)