Hypergraph querying using structural indexing and layer-related-closure verification

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Graph indexing and querying mechanisms have received significant attention because of their importance in analyzing the growing graph datasets found in many domains. Although much work has been done in the context of simple graphs, the resulting techniques are not directly applicable to hypergraphs, which represent more complex relationships in various applications. The key problem here is to search for a given subhypergraph query in a larger hypergraph dataset. This search problem is known to be NP-hard, as it is closely related to subgraph isomorphism. To solve it efficiently, we first create an index set by extracting the common subhypergraph structures from the given hypergraph dataset. Upon receiving a query, we apply the same indexing techniques to the given subhypergraph and create a query index set. Using both indices, we identify the possible locations of the query in the hypergraph dataset. We then start the subhypergraph search to verify whether the query really appears at each location, using an accelerated verification mechanism called the layer-related-closure method. Through experiments on a real hypergraph dataset and on random datasets, we demonstrate the efficiency and effectiveness of hypergraph indexing and of our verification method.

Notes

  1. A triangle-shaped hypergraph can represent various relationships in many other applications as well. For instance, consider a biomedical hypergraph in which the medicines are vertices and each hyperedge is a disease containing the medicines used to treat it. One may want to find three diseases such that every pair of them shares common medicines for treatment.

  2. We implemented the algorithms needed for index construction, matching, and verification in Java. The source code of our programs and more experimental results are available at www.cs.utsa.edu/~korkmaz/research/hypergraph. All experiments were conducted on an Intel Xeon CPU (2.40 GHz) with 24 GB RAM.

Acknowledgments

The authors would like to thank the anonymous reviewers as well as Andrew Wichmann for their constructive comments.

Author information

Corresponding author

Correspondence to Xinran Yu.

Appendix

The steps for generating the adjacency list from the incidence matrix transpose are shown in Algorithm 9.1.

Algorithm 9.1 (HALG) and Algorithm 9.2 (modified HALG): pseudocode listings.

For the hypergraph in Fig. 1, this algorithm generates the adjacency list \(AJL\) as \(\{e_1: \langle e_2, e_3 \rangle ; e_2: \langle e_1, e_3 \rangle ; e_3: \langle e_1, e_2 \rangle \}\). For each hyperedge \(e_i\), the algorithm goes over each node \(v_j\) to determine whether \(v_j \in e_i\). If so, it goes over every other hyperedge \(e_k\) to see whether \(v_j\) also belongs to \(e_k\). If it does, \(e_i\) and \(e_k\) are adjacent hyperedges, and we add \(e_k\) to the adjacency list of \(e_i\). With \(m\) hyperedges and \(n\) nodes, these three nested loops give the algorithm a time complexity of \(O(m^2n)\).
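As a rough illustration of the three nested loops described above, the following Python sketch builds the adjacency list from a transposed incidence matrix. It is not the authors' Java implementation and not Algorithm 9.1 itself. The function name `build_adjacency_list`, the 0-based indexing, and the small three-hyperedge example are assumptions made for illustration only.

```python
def build_adjacency_list(incidence_t):
    """Return {hyperedge index: sorted list of adjacent hyperedge indices}.

    incidence_t is assumed to be the transposed incidence matrix:
    one row per hyperedge, one column per node, with
    incidence_t[i][j] == 1 iff node v_j belongs to hyperedge e_i.
    """
    m = len(incidence_t)                    # number of hyperedges
    n = len(incidence_t[0]) if m else 0     # number of nodes
    adjacency = {i: set() for i in range(m)}
    for i in range(m):                      # each hyperedge e_i
        for j in range(n):                  # each node v_j
            if not incidence_t[i][j]:
                continue                    # v_j is not in e_i
            for k in range(m):              # each other hyperedge e_k
                if k != i and incidence_t[k][j]:
                    adjacency[i].add(k)     # e_i and e_k share v_j
    return {i: sorted(nbrs) for i, nbrs in adjacency.items()}


# Hypothetical triangle-shaped hypergraph (0-based indices):
# e_0 = {v_0, v_1}, e_1 = {v_1, v_2}, e_2 = {v_0, v_2}
triangle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
full_list = build_adjacency_list(triangle)
```

A set is kept per hyperedge so that a neighbor sharing several nodes with \(e_i\) is recorded only once; the loop structure itself still performs on the order of \(m^2n\) membership checks.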

One issue worth mentioning is that, given the adjacency list \(\{e_1: e_2, e_3; e_2: e_1, e_3; e_3: e_1, e_2\}\) for Fig. 1, the HITE algorithm in Sect. 5 would extract the same hypertriangle three times, because it goes through all possible neighbors of every hyperedge. To eliminate these duplicate checks, we can modify only the third “for” loop of the HALG algorithm (changing its range from \(1 \le k\) to \(i \le k\)), as shown in Algorithm 9.2, so that the HITE algorithm does not generate the same hypertriangle more than once.

Now the adjacency list for Fig. 1 becomes \(\{e_1: \langle e_2, e_3\rangle ; e_2: \langle e_3\rangle ; e_3:\langle \rangle \}\). Although we no longer keep the full set of hyperedges adjacent to each hyperedge, this list is easy to generate and allows the HITE algorithm to visit each hypertriangle only once.
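The effect of restricting the third loop can be sketched in Python as follows. This is only an illustration under stated assumptions, not Algorithm 9.2 or the HITE algorithm: the function names are hypothetical, indices are 0-based, the inner loop starts at `i + 1` (so a hyperedge never lists itself), and `candidate_triangles` simply looks for three pairwise adjacent hyperedges, which may be looser than the paper's definition of a hypertriangle.

```python
def build_reduced_adjacency_list(incidence_t):
    """Adjacency list that keeps only higher-indexed neighbors.

    incidence_t[i][j] == 1 iff node v_j belongs to hyperedge e_i
    (rows are hyperedges, columns are nodes).
    """
    m = len(incidence_t)
    n = len(incidence_t[0]) if m else 0
    adjacency = {i: set() for i in range(m)}
    for i in range(m):
        for j in range(n):
            if not incidence_t[i][j]:
                continue
            # Assumed form of the modified third loop: only k > i.
            for k in range(i + 1, m):
                if incidence_t[k][j]:
                    adjacency[i].add(k)
    return {i: sorted(nbrs) for i, nbrs in adjacency.items()}


def candidate_triangles(reduced):
    """List each triple of pairwise adjacent hyperedges exactly once."""
    triples = []
    for i, nbrs in reduced.items():
        for a in nbrs:                  # a > i
            for b in reduced[a]:        # b > a
                if b in nbrs:           # b is also adjacent to i
                    triples.append((i, a, b))
    return triples


# Hypothetical triangle-shaped hypergraph (0-based indices):
# e_0 = {v_0, v_1}, e_1 = {v_1, v_2}, e_2 = {v_0, v_2}
triangle = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
reduced_list = build_reduced_adjacency_list(triangle)
triples = candidate_triangles(reduced_list)
```

Because every stored neighbor has a larger index than its owner, each triple can only be reached in increasing index order, which is why the same three hyperedges are not reported again from a different starting hyperedge.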

About this article

Cite this article

Yu, X., Korkmaz, T. Hypergraph querying using structural indexing and layer-related-closure verification. Knowl Inf Syst 46, 537–565 (2016). https://doi.org/10.1007/s10115-015-0829-4

  • DOI: https://doi.org/10.1007/s10115-015-0829-4
