
Efficient computation of expected motif frequency in uncertain graphs by exploiting possible world marginalization and motif transition

  • Original Article
  • Published in Social Network Analysis and Mining

Abstract

Finding building blocks buried in real-world networks is an important task not only in network science but also in biology, chemistry, sociology, and other fields. Many attempts have been made to search for these building blocks efficiently under different settings. In this study, we take up the challenge of counting motifs in uncertain graphs, i.e., computing the expected frequency of motifs. In general, analyzing uncertain graphs is computationally expensive even for a small graph, because the number of possible worlds is vast and a large number of sample graphs are needed to cover them accurately. To alleviate the inefficiency of sampling, on which many existing studies rely, we propose an analytical computation method that gives the exact expected frequency without costly sampling. The key idea of our method is to marginalize the probability of each possible world on a candidate motif, which drastically reduces the number of possible worlds that must be considered. When the edge-existence probability is uniform and constant, we achieve further acceleration by introducing matrices of the number of transition patterns and of the transition probabilities among motifs. We conduct experimental evaluations on the task of computing the frequency significance of directed 3- and 4-node motifs and undirected 4-node motifs under both the uniform and non-uniform probability settings. The results confirm that the proposed method is effective and efficient: the sampling-based state-of-the-art method needs 2–4 orders of magnitude more time than the proposed method to achieve the same accuracy. In addition, under the uniform probability setting, the accelerated version of our method runs about 1 order of magnitude faster than the general version.
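
The marginalization idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an undirected graph with independent edge-existence probabilities, restricts itself to 3-node induced patterns (wedge and triangle), and the function names and the toy graph are hypothetical. For each candidate node triple, only the 2^3 local edge configurations are enumerated, and the result is compared against brute-force enumeration of all possible worlds.

```python
from itertools import combinations, product


def expected_counts_marginalized(nodes, edge_prob):
    """Expected induced wedge/triangle counts via local marginalization.

    edge_prob maps frozenset({u, v}) to an existence probability.
    Edges are assumed independent (an assumption of this sketch).
    """
    expected = {"wedge": 0.0, "triangle": 0.0}
    for triple in combinations(nodes, 3):
        pairs = [frozenset(p) for p in combinations(triple, 2)]
        probs = [edge_prob.get(pair, 0.0) for pair in pairs]
        # Only the 2**3 edge configurations on this triple matter;
        # all edges outside the triple are marginalized out.
        for state in product((0, 1), repeat=3):
            weight = 1.0
            for present, p in zip(state, probs):
                weight *= p if present else 1.0 - p
            n_edges = sum(state)
            if n_edges == 2:
                expected["wedge"] += weight
            elif n_edges == 3:
                expected["triangle"] += weight
    return expected


def expected_counts_bruteforce(nodes, edge_prob):
    """Same quantity, computed by enumerating every possible world."""
    edges = list(edge_prob)
    expected = {"wedge": 0.0, "triangle": 0.0}
    for state in product((0, 1), repeat=len(edges)):
        weight = 1.0
        world = set()
        for present, edge in zip(state, edges):
            p = edge_prob[edge]
            weight *= p if present else 1.0 - p
            if present:
                world.add(edge)
        for triple in combinations(nodes, 3):
            n_edges = sum(
                frozenset(pair) in world for pair in combinations(triple, 2)
            )
            if n_edges == 2:
                expected["wedge"] += weight
            elif n_edges == 3:
                expected["triangle"] += weight
    return expected


# Hypothetical toy graph: 4 nodes, 5 uncertain edges.
nodes = [0, 1, 2, 3]
edge_prob = {
    frozenset({0, 1}): 0.9,
    frozenset({1, 2}): 0.5,
    frozenset({0, 2}): 0.7,
    frozenset({2, 3}): 0.4,
    frozenset({1, 3}): 0.2,
}
fast = expected_counts_marginalized(nodes, edge_prob)
slow = expected_counts_bruteforce(nodes, edge_prob)
```

In this sketch the brute-force version visits 2^5 worlds for the toy graph and grows exponentially with the number of edges, while the marginalized version does a constant amount of work per candidate triple. Both return the same expectations because, by linearity of expectation, each triple contributes independently of the edges outside it.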

Data availability

Data will be made available on reasonable request.

Notes

  1. When the target graph is clear from the context, it is written as \(T_m\).

  2. The numbers assigned to the directed 3-node motifs follow Milo et al. (2002).

  3. We show only the top-left and bottom-right parts of the transition coefficient matrix because the number of motif patterns is large, e.g., \(M=199\) for \(k=4\) and \(M=9378\) for \(k=5\).

  4. https://ameblo.jp/.

  5. https://www.cosme.net/.

  6. https://www.cs.cornell.edu/projects/kddcup/.

  7. https://slashdot.org/.

  8. https://blog.goo.ne.jp/.

  9. https://www.openstreetmap.org/.

References

  • Ahmed NK, Neville J, Rossi RA, Duffield N (2015) Efficient graphlet counting for large networks. In: 2015 IEEE international conference on data mining, pp 1–10 https://doi.org/10.1109/ICDM.2015.141

  • Boekhout H, Kosters W, Takes F (2019) Efficiently counting complex multilayer temporal motifs in large-scale networks. Comput Soc Netw https://doi.org/10.1186/s40649-019-0068-z

  • Ceccarello M, Fantozzi C, Pietracaprina A, Pucci G, Vandin F (2017) Clustering uncertain graphs. Proc VLDB Endow 11(4):472–484

  • Fushimi T, Saito K, Motoda H (2021) Efficient analytical computation of expected frequency of motifs of small size by marginalization in uncertain network. In: Proceedings of the 2021 IEEE/ACM international conference on advances in social networks analysis and mining, ASONAM’21. Association for Computing Machinery, New York, NY, USA, pp 1–8. https://doi.org/10.1145/3487351.3488275

  • Grochow JA, Kellis M (2007) Network motif discovery using subgraph enumeration and symmetry-breaking. In: Proceedings of the 11th annual international conference on research in computational molecular biology, RECOMB’07. Springer, Berlin, pp 92–106

  • Hocevar T, Demšar J (2017) Combinatorial algorithm for counting small induced graphs and orbits. PLoS ONE 12(2):1–17. https://doi.org/10.1371/journal.pone.0171428

  • Itzhack R, Mogilevski Y, Louzoun Y (2007) An optimal algorithm for counting network motifs. Physica A 381:482–490. https://doi.org/10.1016/j.physa.2007.02.102

  • Jin R, Liu L, Aggarwal CC (2011a) Discovering highly reliable subgraphs in uncertain graphs. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’11. ACM, New York, NY, USA, pp 992–1000

  • Jin R, Liu L, Ding B, Wang H (2011b) Distance-constraint reachability computation in uncertain graphs. Proc VLDB Endow 4(9):551–562

  • Kaluza P, Vingron M, Mikhailov AS (2008) Self-correcting networks: function, robustness, and motif distributions in biological signal processing. Chaos Interdiscipl J Nonlinear Sci 18(02):026113:1-026113:17

  • Khan A, Ye Y, Chen L, Jagadish HV (2018) On uncertain graphs. Morgan & Claypool, San Rafael

  • Kim J, Li ML, Candan K, Sapino M (2017) Personalized PageRank in uncertain graphs with mutually exclusive edges. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, SIGIR’17. ACM, New York, NY, USA, pp 525–534. https://doi.org/10.1145/3077136.3080794

  • Latora V, Nicosia V, Russo G (2017) Complex networks: principles, methods and applications, 1st edn. Cambridge University Press, Cambridge

  • Leskovec J, Krevl A (2014) SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data

  • Li X, Stones RJ, Wang H, Deng H, Liu X, Wang G (2012) Netmode: network motif detection without nauty. PLoS ONE 7(12):1–9. https://doi.org/10.1371/journal.pone.0050093

  • Liu L, Jin R, Aggarwal C, Shen Y (2012) Reliable clustering on uncertain graphs. In: 2012 IEEE 12th international conference on data mining, pp 459–468

  • Liu P, Benson AR, Charikar M (2019) Sampling methods for counting temporal motifs. In: Proceedings of the twelfth ACM international conference on web search and data mining, WSDM’19. Association for Computing Machinery, New York, NY, USA, pp 294–302. https://doi.org/10.1145/3289600.3290988

  • Ma C, Cheng R, Lakshmanan LVS, Grubenmann T, Fang Y, Li X (2019) Linc: a motif counting algorithm for uncertain graphs. Proc VLDB Endow 13(2):155–168. https://doi.org/10.14778/3364324.3364330

  • Marcus D, Shavitt Y (2012) Rage—a rapid graphlet enumerator for large networks. Comput Netw 56(2):810–819. https://doi.org/10.1016/j.comnet.2011.08.019

  • Zitnik M, Sosič R, Maheshwari S, Leskovec J (2018) BioSNAP datasets: Stanford biomedical network dataset collection. http://snap.stanford.edu/biodata

  • Maslov S, Sneppen K (2002) Specificity and stability in topology of protein networks. Science 296(5569):910–913. https://doi.org/10.1126/science.1065103

  • Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2002) Network motifs: simple building blocks of complex networks. Science (New York, NY) 298(5594):824–827. https://doi.org/10.1126/science.298.5594.824

  • Mukherjee AP, Xu P, Tirthapura S (2015) Mining maximal cliques from an uncertain graph. In: 2015 IEEE 31st international conference on data engineering, pp 243–254. https://doi.org/10.1109/ICDE.2015.7113288

  • Newman MEJ (2003) The structure and function of complex networks. SIAM Rev 45:167–256

  • Pfeiffer JJ, Neville J (2011) Methods to determine node centrality and clustering in graphs with uncertain structure. In: Proceedings of the fifth international conference on weblogs and social media. The AAAI Press, pp 590–593

  • Pinar A, Seshadhri C, Vishal V (2017) Escape: efficiently counting all 5-vertex subgraphs. In: Proceedings of the 26th international conference on world wide web, WWW’17. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, pp 1431–1440. https://doi.org/10.1145/3038912.3052597

  • Potamias M, Bonchi F, Gionis A, Kollios G (2010) K-nearest neighbors in uncertain graphs. Proc VLDB Endow 3(1–2):997–1008

  • Ren Y, Sarkar A, Kahveci T (2018) Promote: an efficient algorithm for counting independent motifs in uncertain network topologies. BMC Bioinform. https://doi.org/10.1186/s12859-018-2236-9

  • Ribeiro P, Silva F (2010) G-tries: an efficient data structure for discovering network motifs. In: SAC’10. Association for Computing Machinery, New York, NY, USA, pp 1559–1566. https://doi.org/10.1145/1774088.1774422

  • Sarpe I, Vandin F (2021) OdeN: simultaneous approximation of multiple motif counts in large temporal networks, pp 1568–1577. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3459637.3482459

  • Todor A, Dobra A, Kahveci T (2015) Counting motifs in probabilistic biological networks. In: Proceedings of the 6th ACM conference on bioinformatics, computational biology and health informatics, BCB’15. Association for Computing Machinery, New York, NY, USA, pp 116–125. https://doi.org/10.1145/2808719.2808731

  • Tran N, Choi KP, Zhang L (2013) Counting motifs in the human interactome. Nat Commun 4:2241. https://doi.org/10.1038/ncomms3241

  • Wernicke S (2005) A faster algorithm for detecting network motifs. In: Proceedings of the 5th international conference on algorithms in bioinformatics, WABI’05. Springer, Berlin, Heidelberg, pp 165–177. https://doi.org/10.1007/11557067_14

  • Zou Z, Li J, Gao H, Zhang S (2010) Mining frequent subgraph patterns from uncertain graph data. IEEE Trans Knowl Data Eng 22(9):1203–1218. https://doi.org/10.1109/TKDE.2010.80

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number 20K11940.

Author information

Corresponding author

Correspondence to Takayasu Fushimi.

Ethics declarations

Competing interests

The authors declare no financial or non-financial competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fushimi, T., Saito, K. & Motoda, H. Efficient computation of expected motif frequency in uncertain graphs by exploiting possible world marginalization and motif transition. Soc. Netw. Anal. Min. 12, 126 (2022). https://doi.org/10.1007/s13278-022-00956-y

  • DOI: https://doi.org/10.1007/s13278-022-00956-y
