Abstract
Network motifs are interaction patterns that occur significantly more often in a complex network than in corresponding randomized networks. They have been found effective in characterizing many real-world networks. A number of network motif detection algorithms have been proposed in the literature, most of which assume that the interactions in a motif are deterministic, i.e., either present or missing. Conjecturing that real-world networks result from interaction patterns that are stochastic in nature, this paper proposes the use of stochastic models to achieve more robust motif detection. In particular, we propose a finite mixture model to detect multiple stochastic network motifs. A component-wise expectation maximization (CEM) algorithm is derived for the finite mixture of stochastic network motifs so that both the optimal number of motifs and the motif parameters can be estimated automatically. For performance evaluation, we applied the proposed algorithm to both synthetic networks and a number of online social network data sets and demonstrated that it outperformed the deterministic motif detection algorithm FANMOD as well as the conventional EM algorithm in terms of robustness against noise. We also discuss how to interpret the detected stochastic network motifs to gain insights into the interaction patterns embedded in the network data. In addition, the algorithm’s computational complexity and runtime performance are presented for efficiency evaluation.
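To make the idea of a finite mixture of stochastic motifs concrete, the following is a minimal sketch, not the algorithm derived in the paper. It assumes each 3-node directed subgraph instance has already been put into a canonical node order and flattened into a binary vector of its six possible edges, and it fits a mixture of independent Bernoulli distributions with the conventional EM algorithm for a fixed number of components. The CEM variant with automatic selection of the number of motifs is not reproduced here; the function name `em_bernoulli_mixture`, the edge-slot ordering, and the toy data are illustrative assumptions.

```python
import math
import random


def em_bernoulli_mixture(X, k, n_iter=50, seed=0):
    """Fit a k-component mixture of independent Bernoullis by plain EM.

    X is a list of binary vectors (flattened subgraph edge indicators).
    Returns (weights, theta), where theta[m][j] is the estimated
    probability that edge slot j is present in stochastic motif m.
    This is conventional EM with a fixed k, not the paper's CEM.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    eps = 1e-9  # guards against log(0) and division by zero
    weights = [1.0 / k] * k
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each instance
        resp = []
        for x in X:
            logp = []
            for m in range(k):
                lp = math.log(weights[m] + eps)
                for j in range(d):
                    p = min(max(theta[m][j], eps), 1.0 - eps)
                    lp += math.log(p) if x[j] else math.log(1.0 - p)
                logp.append(lp)
            top = max(logp)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([v / total for v in unnorm])

        # M-step: re-estimate mixing weights and edge probabilities
        for m in range(k):
            nm = sum(r[m] for r in resp)
            weights[m] = nm / n
            for j in range(d):
                theta[m][j] = sum(r[m] * x[j] for r, x in zip(resp, X)) / (nm + eps)

    return weights, theta


# Assumed edge-slot order: (0->1, 0->2, 1->0, 1->2, 2->0, 2->1)
feed_forward = [1, 1, 0, 1, 0, 0]  # 0->1, 0->2, 1->2
cycle = [1, 0, 0, 1, 1, 0]         # 0->1, 1->2, 2->0
data = [feed_forward] * 30 + [cycle] * 10  # toy, noise-free instances
weights, theta = em_bernoulli_mixture(data, k=2)
print(weights)
print(theta)
```

Each row of `theta` can be read as one stochastic motif: an entry close to 1 or 0 indicates an edge that is almost always present or absent, while intermediate values indicate an interaction that occurs only some of the time.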
Notes
The formulation presented in this section is based on directed graphs, but can be easily modified for undirected graphs as well.
The relative errors \(e_{\lambda }\) and \(e_{\Theta }\) for the mixture models inferred based on different Z-score thresholds for initialization were calculated by comparing the model parameters \(\lambda \) and \(\Theta \) with the reference case of having the Z-score threshold set to 2.
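The two quantities mentioned in this note can be sketched as follows. The Z-score is the usual motif significance measure (observed subgraph count compared with the mean and standard deviation of counts over randomized networks). The exact definition of the relative errors is not given in this excerpt, so the Euclidean-norm form below is an assumption, as are the function names and the toy numbers.

```python
import math
from statistics import mean, pstdev


def motif_z_score(count_real, counts_random):
    """Z-score of a subgraph count against counts from randomized networks."""
    mu = mean(counts_random)
    sigma = pstdev(counts_random)  # population std; using sample std is equally plausible
    if sigma == 0:
        return math.inf if count_real > mu else 0.0
    return (count_real - mu) / sigma


def relative_error(estimate, reference):
    """Assumed form: ||estimate - reference|| / ||reference|| (Euclidean norm)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(estimate, reference)))
    den = math.sqrt(sum(b ** 2 for b in reference))
    return num / den


# Toy numbers: a subgraph seen 40 times in the real network and
# about 10 times in each of four randomized networks.
z = motif_z_score(40, [10, 12, 8, 10])
keep_for_initialization = z > 2  # threshold of 2, as in the reference case

# Mixing weights inferred under another threshold vs. the reference case.
e_lambda = relative_error([0.6, 0.4], [0.5, 0.5])
print(z, keep_for_initialization, e_lambda)
```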
Acknowledgments
This work was supported by the General Research Fund (HKBU210410) from the Research Grant Council of the Hong Kong Special Administrative Region, China.
Cite this article
Liu, K., Cheung, W.K. & Liu, J. Detecting multiple stochastic network motifs in network data. Knowl Inf Syst 42, 49–74 (2015). https://doi.org/10.1007/s10115-013-0680-4
DOI: https://doi.org/10.1007/s10115-013-0680-4