Detecting multiple stochastic network motifs in network data

  • Regular Paper
Knowledge and Information Systems

Abstract

Network motifs are interaction patterns that occur significantly more often in a complex network than in corresponding randomized networks, and they have proved effective in characterizing many real-world networks. Most motif detection algorithms proposed in the literature assume that the interactions in a motif are deterministic, i.e., either present or absent. Conjecturing that real-world networks result from interaction patterns that are stochastic in nature, this paper proposes the use of stochastic models to achieve more robust motif detection. In particular, we propose a finite mixture model for detecting multiple stochastic network motifs. We derive a component-wise expectation maximization (CEM) algorithm for the finite mixture of stochastic network motifs so that both the optimal number of motifs and the motif parameters can be estimated automatically. For performance evaluation, we applied the proposed algorithm to synthetic networks and to a number of online social network data sets, and demonstrated that it outperformed both the deterministic motif detection algorithm FANMOD and the conventional EM algorithm in terms of robustness against noise. We also discuss how to interpret the detected stochastic network motifs to gain insight into the interaction patterns embedded in the network data. Finally, we present the algorithm’s computational complexity and runtime performance to evaluate its efficiency.
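
To make the idea of a finite mixture of stochastic motifs concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each 3-node subgraph instance has already been flattened into a 0/1 edge vector under a fixed node ordering (so node relabeling and isomorphism are ignored), treats each stochastic motif as a vector of independent Bernoulli edge probabilities, fixes the number of motifs `k` in advance, and fits the mixture with plain EM rather than CEM. The function name `em_bernoulli_mixture`, the edge ordering, and the toy data are all illustrative assumptions.

```python
import math
import random

# Edge order for a 3-node directed subgraph (an illustrative convention):
# (0->1, 0->2, 1->0, 1->2, 2->0, 2->1)
FEED_FORWARD = [1, 1, 0, 1, 0, 0]
CYCLE = [1, 0, 0, 1, 1, 0]


def em_bernoulli_mixture(X, k, n_iter=50, seed=0, eps=1e-6):
    """Fit a k-component mixture of independent Bernoulli edge models by plain EM.

    X is a list of equal-length 0/1 lists, one flattened adjacency vector per
    subgraph instance. Returns (weights, thetas), where thetas[m][j] estimates
    the probability that edge j is present in stochastic motif m.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / k] * k
    thetas = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each motif for each instance,
        # computed in log space for numerical stability.
        resp = []
        for x in X:
            logp = []
            for m in range(k):
                lp = math.log(weights[m])
                for bit, t in zip(x, thetas[m]):
                    lp += math.log(t if bit else 1.0 - t)
                logp.append(lp)
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])

        # M-step: re-estimate mixing weights and edge probabilities.
        for m in range(k):
            mass = max(sum(r[m] for r in resp), eps)
            weights[m] = mass / n
            for j in range(d):
                t = sum(r[m] * x[j] for r, x in zip(resp, X)) / mass
                # Clip away from 0 and 1 so the logarithms above stay finite.
                thetas[m][j] = min(1.0 - eps, max(eps, t))
        norm = sum(weights)
        weights = [w / norm for w in weights]

    return weights, thetas


if __name__ == "__main__":
    # Toy data: 60 feed-forward loops, every sixth missing its 1->2 edge,
    # plus 40 three-node cycles.
    data = []
    for i in range(60):
        inst = list(FEED_FORWARD)
        if i % 6 == 0:
            inst[3] = 0
        data.append(inst)
    data += [list(CYCLE) for _ in range(40)]

    weights, thetas = em_bernoulli_mixture(data, k=2)
    for w, theta in zip(weights, thetas):
        print(round(w, 3), [round(t, 2) for t in theta])
```

In this sketch an edge probability strictly between 0 and 1 is what distinguishes a stochastic motif from a deterministic one. Selecting the number of components automatically, as the abstract describes for CEM, would require an additional model selection step that is not shown here.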

Notes

  1. http://www.weizmann.ac.il/mcb/urialon/groupnetworkmotifsw.html.

  2. http://theinf1.informatik.uni-jena.de/~wernicke/motifs/index.html.

  3. The formulation presented in this section is based on directed graphs, but can be easily modified for undirected graphs as well.

  4. The relative errors \(e_{\lambda }\) and \(e_{\Theta }\) for the mixture models inferred based on different Z-score thresholds for initialization were calculated by comparing the model parameters \(\lambda \) and \(\Theta \) with the reference case of having the Z-score threshold set to 2.
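
For orientation, the Z-score referred to in note 4 is conventionally defined in the motif detection literature as shown below. That the paper uses this standard form unchanged is an assumption, and the symbols \(N_{\mathrm{real}}\), \(\mu_{\mathrm{rand}}\) and \(\sigma_{\mathrm{rand}}\) are introduced here only for illustration.

\[
Z(g) = \frac{N_{\mathrm{real}}(g) - \mu_{\mathrm{rand}}(g)}{\sigma_{\mathrm{rand}}(g)}
\]

Here \(N_{\mathrm{real}}(g)\) is the number of occurrences of subgraph \(g\) in the observed network, and \(\mu_{\mathrm{rand}}(g)\) and \(\sigma_{\mathrm{rand}}(g)\) are the mean and standard deviation of its count over an ensemble of randomized networks. As a purely hypothetical example, a subgraph observed 120 times whose randomized counts have mean 80 and standard deviation 10 would have \(Z = (120 - 80)/10 = 4\), which exceeds a threshold of 2.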

Acknowledgments

This work was supported by the General Research Fund (HKBU210410) from the Research Grants Council of the Hong Kong Special Administrative Region, China.

Author information

Corresponding author

Correspondence to Kai Liu.

About this article

Cite this article

Liu, K., Cheung, W.K. & Liu, J. Detecting multiple stochastic network motifs in network data. Knowl Inf Syst 42, 49–74 (2015). https://doi.org/10.1007/s10115-013-0680-4

  • DOI: https://doi.org/10.1007/s10115-013-0680-4
