PANDA: toward partial topology-based search on large networks in a single machine

Xie, Miao; Bhowmick, Sourav S.; Cong, Gao; Wang, Qing

doi:10.1007/s00778-016-0447-0

Regular Paper
Published: 18 November 2016

Volume 26, pages 203–228, (2017)
The VLDB Journal

Miao Xie¹,²,³, Sourav S. Bhowmick¹, Gao Cong¹ & Qing Wang²

Abstract

A large body of research has focused on efficient and scalable processing of subgraph search queries on large networks. In these efforts, a query is posed as a connected query graph. In practice, however, end users may not always know the topological relationships between the nodes of a query graph precisely enough to formulate a connected query. In this paper, we present a novel graph querying paradigm called partial topology-based network search and propose a query processing framework called PANDA to efficiently process partial topology queries (PTQs) in a single machine. A PTQ is a disconnected query graph containing multiple connected query components. PTQs allow an end user to formulate queries without precise information about the complete topology of a query graph. To this end, we propose an exact algorithm and an approximate algorithm, called SEN-PANDA and PO-PANDA, respectively, to generate the top-k matches of a PTQ. We also present a subgraph simulation-based optimization technique to further speed up the processing of PTQs. Using real-life networks with millions of nodes, we experimentally verify that our proposed algorithms are superior to several baseline techniques.
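
To make the definition of a partial topology query concrete, the following minimal sketch splits a query graph into its connected query components and checks whether it is disconnected. It only illustrates the definition given in the abstract and is not the PANDA framework or either of its algorithms; the function names, the edge-list representation, and the example query are assumptions made for illustration.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted node lists."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Iterative depth-first search from a node not yet visited.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def is_partial_topology_query(nodes, edges):
    """Treat a query as a PTQ when it has more than one connected component."""
    return len(connected_components(nodes, edges)) > 1


# Hypothetical query: the user knows how a, b, c relate and how d, e relate,
# but not how the two groups are connected to each other.
query_nodes = ["a", "b", "c", "d", "e"]
query_edges = [("a", "b"), ("b", "c"), ("d", "e")]

components = connected_components(query_nodes, query_edges)
```

Here `components` holds two connected query components, so the query is disconnected in the sense the abstract describes. A conventional subgraph search query would instead produce exactly one component.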

Notes

As we shall see later, our solution framework can easily handle overlapping cases by mapping them to a Steiner tree problem.
http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/yeast.zip.

Acknowledgements

Qing Wang is supported by the National Natural Science Foundation of China under grants 61432001, 91318301, 91218302.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
Miao Xie, Sourav S. Bhowmick & Gao Cong
Institute of Software, Chinese Academy of Sciences, Beijing, China
Miao Xie & Qing Wang
CSI Euler Department, Huawei, Beijing, China
Miao Xie

Corresponding author

Correspondence to Sourav S. Bhowmick.

Additional information

This work was primarily done when the first author was visiting Nanyang Technological University.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 28 KB)

About this article

Cite this article

Xie, M., Bhowmick, S.S., Cong, G. et al. PANDA: toward partial topology-based search on large networks in a single machine. The VLDB Journal 26, 203–228 (2017). https://doi.org/10.1007/s00778-016-0447-0

Received: 23 March 2016
Revised: 02 September 2016
Accepted: 01 November 2016
Published: 18 November 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s00778-016-0447-0
