ABSTRACT
Several visual graph query interfaces (a.k.a. GUIs) expose a set of canned patterns (i.e., small subgraph patterns) to expedite subgraph query formulation by enabling pattern-at-a-time construction. Unfortunately, manually generating canned patterns is not only labour-intensive, but the resulting patterns may also lack the diversity needed to support efficient visual formulation of a wide range of subgraph queries. Recent efforts have taken a data-driven approach that automatically selects high-quality canned patterns for a GUI from the underlying graph database. However, as the underlying database evolves, the selected patterns may become stale and adversely impact efficient query formulation. In this paper, we present a novel framework called Midas for efficient and effective maintenance of canned patterns as the database evolves. Specifically, it adopts a selective maintenance strategy that guarantees a progressive gain in the coverage of the patterns without sacrificing their diversity or increasing their cognitive load. An experimental study with real-world datasets and visual graph interfaces demonstrates the effectiveness of Midas compared to static GUIs.
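To make the idea of selective maintenance concrete, the following is a minimal sketch and not the Midas algorithm itself. It assumes each pattern comes with a precomputed set of the database graphs it covers and a numeric cognitive-load score. Diversity is approximated here by the minimum pairwise Jaccard distance between covered sets, which is a stand-in chosen only for illustration. The names `Pattern`, `coverage`, `diversity`, `cognitive_load`, and `maintain` are all hypothetical. A candidate pattern replaces an existing one only if the swap strictly increases coverage while neither lowering diversity nor raising cognitive load.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Pattern:
    name: str
    covered: frozenset      # ids of database graphs containing the pattern (assumed precomputed)
    load: float             # hypothetical cognitive-load score, lower is better


def coverage(patterns):
    """Number of distinct database graphs covered by at least one pattern."""
    covered = set()
    for p in patterns:
        covered |= p.covered
    return len(covered)


def diversity(patterns):
    """Stand-in diversity: minimum pairwise Jaccard distance of covered sets."""
    distances = []
    for a, b in combinations(patterns, 2):
        union = a.covered | b.covered
        if not union:
            distances.append(0.0)
        else:
            distances.append(1.0 - len(a.covered & b.covered) / len(union))
    # With fewer than two patterns there is no pair to compare.
    return min(distances) if distances else 1.0


def cognitive_load(patterns):
    """Stand-in load of a pattern set: the load of its most demanding pattern."""
    return max((p.load for p in patterns), default=0.0)


def maintain(current, candidates):
    """Swap in a candidate only when coverage strictly improves and
    diversity and cognitive load do not get worse."""
    current = list(current)
    for cand in candidates:
        if cand in current:
            continue
        best = None
        for i in range(len(current)):
            trial = current[:i] + [cand] + current[i + 1:]
            if (coverage(trial) > coverage(current)
                    and diversity(trial) >= diversity(current)
                    and cognitive_load(trial) <= cognitive_load(current)):
                if best is None or coverage(trial) > coverage(best):
                    best = trial
        if best is not None:
            current = best
    return current


if __name__ == "__main__":
    old = [Pattern("triangle", frozenset({1, 2}), 1.0),
           Pattern("path", frozenset({2, 3}), 1.0)]
    fresh = [Pattern("star", frozenset({4, 5, 6}), 1.0)]
    updated = maintain(old, fresh)
    print([p.name for p in updated], coverage(updated))
```

Because a swap is accepted only when coverage strictly increases, coverage never drops across a sequence of updates in this sketch, which mirrors the "progressive gain" property described above under these simplified definitions.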
Index Terms
- MIDAS: Towards Efficient and Effective Maintenance of Canned Patterns in Visual Graph Query Interfaces