Abstract
Collaboration networks are elegant representations for studying the dynamical processes that shape the scientific community. In this paper, we study the local context of a node in a collaboration network, which can help explain the behavior of an author both as an individual within a group and as a member of that group. The best representation of such local contextual substructures in a collaboration network is the "network motif". In particular, we propose two fundamental goodness measures for a group represented by a motif: productivity and longevity. We observe that, quite strikingly, the 4-semi clique motif shows the highest longevity, whereas the 4-star and 4-clique motifs have the highest productivity among all the motifs. Based on the productivity distribution of the motifs, we propose a predictive model that successfully separates highly cited authors from the rest. Further, we study the characteristic features of motifs and show how they are related to the two goodness measures. Building on these observations, we finally propose two supervised classification models to predict, early in a researcher's career, how long the group she belongs to will persist (longevity) and how productive that group will be. This empirical study thus lays the foundation for a recommendation system that would forecast how long-lasting and productive a given collaboration could be in the future.
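To make the motif vocabulary concrete, the following minimal Python sketch labels a group of four authors by the shape of their co-authorship subgraph and counts those shapes in a small network. It is an illustration under stated assumptions, not the detection procedure used in the paper: the function names `classify_group` and `count_motifs` are hypothetical, the brute-force enumeration is only practical for tiny graphs, and the reading of "4-semi clique" as a 4-clique with one edge missing is an assumption. Only the names 4-star, 4-clique and 4-semi clique come from the abstract; the remaining labels are ordinary graph-theory names chosen for this example.

```python
from itertools import combinations

# Sorted degree sequences of the six connected graphs on four nodes.
# Each sequence identifies exactly one shape, so a lookup is enough.
MOTIF_BY_DEGREES = {
    (1, 1, 1, 3): "4-star",
    (1, 1, 2, 2): "4-chain",            # label chosen for this sketch
    (1, 2, 2, 3): "4-tailed-triangle",  # label chosen for this sketch
    (2, 2, 2, 2): "4-cycle",            # label chosen for this sketch
    (2, 2, 3, 3): "4-semi-clique",      # ASSUMED: clique with one edge missing
    (3, 3, 3, 3): "4-clique",
}


def classify_group(authors, coauthor_edges):
    """Name the motif induced by four authors, or None if they are not connected."""
    members = set(authors)
    if len(members) != 4:
        raise ValueError("expected exactly four distinct authors")
    # Treat edges as undirected and ignore duplicates and self-loops.
    edges = {frozenset(e) for e in coauthor_edges}
    degree = {a: 0 for a in members}
    for e in edges:
        if len(e) == 2 and e <= members:
            for a in e:
                degree[a] += 1
    # Disconnected groups have degree sequences that are not in the table.
    return MOTIF_BY_DEGREES.get(tuple(sorted(degree.values())))


def count_motifs(coauthor_edges):
    """Count induced 4-node motifs by brute force over all 4-author subsets."""
    edges = {frozenset(e) for e in coauthor_edges}
    nodes = sorted({a for e in edges for a in e})
    counts = {}
    for group in combinations(nodes, 4):
        name = classify_group(group, edges)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    star = [("a", "b"), ("a", "c"), ("a", "d")]
    print(classify_group("abcd", star))
    print(count_motifs(star + [("a", "e")]))
```

Looking shapes up by sorted degree sequence keeps the sketch short, but it works only because no two connected 4-node graphs share a degree sequence; larger motifs would need a proper isomorphism test.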

















Notes
The code is publicly available at https://github.com/remenberl/KDDCup2013.
Note that we refer to the rare class as the negative class and the frequently observed class as the positive class in the rest of the paper.
Additional information
T. Chakraborty was financially supported by a Google India Ph.D. Fellowship.
Cite this article
Chakraborty, T., Ganguly, N. & Mukherjee, A. An author is known by the context she keeps: significance of network motifs in scientific collaborations. Soc. Netw. Anal. Min. 5, 16 (2015). https://doi.org/10.1007/s13278-015-0255-3
DOI: https://doi.org/10.1007/s13278-015-0255-3