Abstract
With the increasing prevalence of data that model relationships between entities, a graph-based representation of real-world problems offers a logical strategy for organizing information and making knowledge-based decisions. In particular, it is often useful to identify the most frequent patterns or relationships among the data in a graph, which requires finding frequent subgraphs. Algorithms for that problem have been proposed for over 15 years. In the worst case, all subgraphs of the graph must be examined, of which there are exponentially many, and subgraph isomorphisms must be computed, which is an NP-complete problem. Frequent subgraph algorithms may attempt to improve actual runtime performance by reducing the size of the search space, avoiding duplicate comparisons, and/or minimizing the amount of memory required for storing intermediate results. Herein we present a frequent subgraph mining algorithm that leverages mapping sets to eliminate the isomorphism computation during the search for non-edge-disjoint frequent subgraphs. Experimental results show that the absence of isomorphism computation leads to much faster frequent subgraph detection when all occurrences of those subgraphs must be identified.
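The mapping-set idea can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the FSMS algorithm itself, whose details the abstract does not give. The function names (`single_edge_mappings`, `extend`, `occurrences`), the toy graph, and the choice to count an occurrence as a distinct set of covered graph edges are all hypothetical. The point it shows is that once every embedding of a pattern is stored, a larger pattern's embeddings can be obtained by extending the stored ones, with no subgraph isomorphism test.

```python
from collections import defaultdict

def single_edge_mappings(adj, labels):
    """Group graph edges by label pair; each mapping is a tuple (u, v)."""
    mappings = defaultdict(set)
    for u, nbrs in adj.items():
        for v in nbrs:
            # Keep one orientation per label pair (assumed convention).
            if labels[u] <= labels[v]:
                mappings[(labels[u], labels[v])].add((u, v))
    return mappings

def extend(mappings, adj, labels, anchor, new_label):
    """Grow each stored mapping by one new vertex attached at position `anchor`."""
    extended = set()
    for m in mappings:
        for w in adj[m[anchor]]:
            # Only neighbors with the right label that are not already mapped.
            if labels[w] == new_label and w not in m:
                extended.add(m + (w,))
    return extended

def occurrences(mappings, pattern_edges):
    """Collapse mappings that cover the same graph edges into one occurrence."""
    return {
        frozenset(frozenset((m[i], m[j])) for i, j in pattern_edges)
        for m in mappings
    }

# Toy labeled, undirected graph (hypothetical data).
labels = {1: "A", 2: "B", 3: "A", 4: "B", 5: "C"}
adj = {1: {2}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5}, 5: {2, 4}}

ab = single_edge_mappings(adj, labels)[("A", "B")]      # embeddings of A-B
abc = extend(ab, adj, labels, anchor=1, new_label="C")  # embeddings of A-B-C
abc_edges = [(0, 1), (1, 2)]                            # pattern edges by position
print(len(occurrences(abc, abc_edges)))                 # 3 overlapping occurrences
```

Two of the three A-B-C occurrences share the graph edge 2-5, so they are not edge-disjoint, which is the setting the abstract addresses. The sketch only grows patterns by adding a vertex; closing a cycle and detecting duplicate patterns reached by different growth orders are left out.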
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Abedijaberi, A., Leopold, J. (2016). FSMS: A Frequent Subgraph Mining Algorithm Using Mapping Sets. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_58
DOI: https://doi.org/10.1007/978-3-319-41920-6_58
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)