ABSTRACT
In many scientific and commercial domains, the graph has become an increasingly important data structure for modeling sophisticated structures. The past few years have seen a sharp increase in research on mining graph data. We previously proposed a unified framework for graph mining and for analysis of the extracted substructures, a task that had not been addressed at the time. Pruthak, a graph mining tool, was developed based on this framework. The tool provides preprocessing, frequent substructure discovery, dense substructure extraction, and visualization techniques for graph representations of data. In this paper we discuss the approach taken in the design and implementation of Pruthak. We then describe our study of the Digital Bibliography & Library Project (DBLP) dataset, in which we used the tool to mine and analyze substructures. The results of the study demonstrate that the tool works correctly, as intended, and is usable.
Index Terms
- Pruthak: mining and analyzing graph substructures