
An author is known by the context she keeps: significance of network motifs in scientific collaborations

  • Original Article
Social Network Analysis and Mining

Abstract

Collaboration networks are elegant representations for studying the dynamical processes that shape the scientific community. In this paper, we study the local context of a node in a collaboration network, which can help explain the behavior of an author both as an individual within a group and as a member acting along with the group. The best representation of such local contextual substructures in a collaboration network is the “network motif”. In particular, we propose two fundamental goodness measures of a group represented by a motif: productivity and longevity. We observe that while the 4-semi clique motif, quite strikingly, shows the highest longevity, the productivity of the 4-star and 4-clique motifs is the largest among all the motifs. Based on the productivity distribution of the motifs, we propose a predictive model that successfully separates highly cited authors from the rest. Further, we study the characteristic features of motifs and show how they relate to the two goodness measures. Building on these observations, we finally propose two supervised classification models to predict, early in a researcher’s career, how long the group to which she belongs will persist (longevity) and how productive that group will be. This empirical study thus lays the foundation for a recommendation system that would forecast how long-lasting and productive a given collaboration could be in the future.
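To make the notion of a 4-node motif concrete, the following is a minimal sketch of how connected four-author subgraphs in a small co-authorship graph could be enumerated and labeled. It is not the authors' pipeline. The function names, the toy edge list, and the label set are hypothetical, and the sketch assumes that "4-semi clique" denotes four nodes joined by five of the six possible edges, which the abstract does not state explicitly. The brute-force enumeration is only practical for toy graphs; studies at scale rely on dedicated motif-detection tools.

```python
from collections import Counter
from itertools import combinations

# Sorted degree sequence -> label for each connected 4-node shape.
# ASSUMPTION: "4-semi-clique" is taken to mean 4 nodes with 5 edges;
# the other label names are illustrative, not taken from the paper.
LABELS = {
    (1, 1, 2, 2): "4-path",
    (1, 1, 1, 3): "4-star",
    (2, 2, 2, 2): "4-cycle",
    (1, 2, 2, 3): "tailed-triangle",
    (2, 2, 3, 3): "4-semi-clique",
    (3, 3, 3, 3): "4-clique",
}


def classify_quad(nodes, adj):
    """Label the subgraph induced by four nodes, or return None if it is disconnected."""
    # Degree of each node counted only inside the induced subgraph.
    degrees = sorted(
        sum(1 for v in nodes if v != u and v in adj[u]) for u in nodes
    )
    # For 4 nodes, the sorted degree sequence identifies each connected
    # shape uniquely, and no disconnected shape shares one of these keys.
    return LABELS.get(tuple(degrees))


def count_4_node_motifs(edges):
    """Count induced connected 4-node subgraphs by type (brute force, O(n^4))."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = Counter()
    for quad in combinations(sorted(adj), 4):
        label = classify_quad(quad, adj)
        if label is not None:
            counts[label] += 1
    return counts


# Hypothetical toy co-authorship graph: an edge means two authors share a paper.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
toy_counts = count_4_node_motifs(toy_edges)
```

Per-motif counts of this kind are the raw material on which group-level measures such as productivity and longevity would then be computed; how those measures are defined is left to the full paper.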


Notes

  1. http://www-personal.umich.edu/~mejn/.

  2. The code is publicly available at https://github.com/remenberl/KDDCup2013.

  3. http://www.minet.uni-jena.de/~wernicke/motifs/.

  4. Note that we refer to the rare class as the negative class and the frequently observed class as the positive class in the rest of the paper.

  5. http://chasen.org/~taku/software/yamcha/.

  6. http://chasen.org/~taku/software/TinySVM/.

  7. http://www.cs.cornell.edu/home/kleinber/.

  8. http://www.cs.uiuc.edu/~hanj/.

  9. http://www.cs.berkeley.edu/~jordan/.

  10. http://ciir.cs.umass.edu/~allan/.

  11. http://users.ecs.soton.ac.uk/nrj/.

  12. http://journals.aps.org/datasets.


Author information

Corresponding author

Correspondence to Tanmoy Chakraborty.

Additional information

T. Chakraborty was financially supported by Google India Ph.D. Fellowship.


About this article


Cite this article

Chakraborty, T., Ganguly, N. & Mukherjee, A. An author is known by the context she keeps: significance of network motifs in scientific collaborations. Soc. Netw. Anal. Min. 5, 16 (2015). https://doi.org/10.1007/s13278-015-0255-3


  • DOI: https://doi.org/10.1007/s13278-015-0255-3
